Introduction

During  the  past  decade,  we  witnessed  an  extraordinary  evolution  in  surgical  care 
based  upon  rapid  advances  in  technology  and  creative  approaches  to  medicine.  The 
increased  speed  and  power  of  computer  applications,  the  rise  of  visualization 
technologies  related  to  imaging  and  image  guidance,  improvement  in  simulation-based 
technologies (tissue properties, tool-tissue interaction, graphics, haptics, etc.) have caused
an  explosion  in  surgical  advances.  That  said,  we  remain  far  behind  scientists  in  applying 
information  systems  to  patient  care.  This  research  effort  has  proceeded  under  the  mantle 
of  “Operating  Room  of  the  Future”  research.  We  have  replaced  that  theme  with  the  more 
appropriate  “Innovations  in  the  Surgical  Environment.” 

This final report contains information pertinent to continued
activities under contract DAMD-17-03-2-0001, “Advanced Technologies in
Safe and Efficient Operating Rooms.” This initial research endeavor underpinned
and was scoped to fit seamlessly into a continuing project, W81XWH-06-2-005,
“Advanced Technologies in Safe and Efficient Operating Rooms”.

The current research project was initially based upon three pillars of research: OR
Informatics, Simulation for Training, and Smart Image. A fourth pillar, cognitive
ergonomics/human  factors,  was  added  during  the  past  year.  At  the  beginning  of  this 
period  of  performance,  there  were  five  projects  that  comprised  the  Informatics  pillar  and 
two  for  Smart  Image.  The  Simulation  pillar  has  been  comprised  of  a  single  project,  The 
Maryland  Virtual  Patient.  Going  forward,  this  pillar  will  be  expanded  to  include  research 
conducted  in  and  for  the  larger  Simulation  Training  Program  in  the  MASTRI  Center. 

Last  summer,  we  convened  our  annual  conference,  Innovations  in  the  Surgical 
Environment,  a  meeting  that  serves  as  a  deep  recapitulation  of  research  performed  under 
the  contract.  Additionally,  the  conference  presents  an  opportunity  to  explore  innovative 
approaches  to  surgical  research  with  government,  academic  and  industry  partners,  and 
expands  our  capability  to  develop  collaborative  relationships.  This  year,  the  conference 
theme  was  lessons  learned  from  the  high-stakes  environments  of  aviation  and 
astronautics  applied  to  the  high-stakes  environment  of  the  operating  room. 

Body 

A.  OR  Informatics 

Informatics  subgroup  1.  The  Perioperative  Scheduling  Study  (WORQ) 

The  Perioperative  Scheduling  Study  examined  how  using  post-operative 
destination  information  during  the  process  of  surgery  scheduling  can  influence 
congestion in postoperative units such as ICUs and IMCs, which leads to overnight
boarders  in  the  PACU.  The  research  team  is  composed  of  Jeffrey  W.  Herrmann,  Ph.D., 
and  Greg  Brown,  a  graduate  student,  both  with  the  University  of  Maryland,  College  Park. 
The  team  is  working  closely  with  Michael  Harrington,  Ramon  Konewko,  R.N.,  and  Paul 
Nagy,  Ph.D.,  for  guidance  and  assistance. 

The surgery scheduling process has been carefully studied to understand the
different  organizations  and  persons  who  participate  in  the  process,  including  the 
schedulers  in  the  surgical  services,  the  perioperative  services  office,  the  PACU  manager, 
and  the  OR  charge  nurse.  Interviews  with  many  of  these  groups  and  observations  of  their 
scheduling  process  were  conducted  on  January  17,  2008.  These  groups  also  provided 
copies  of  their  scheduling  policies  and  typical  schedules. 

We  are  currently  developing  a  mathematical  model  for  evaluating  congestion  in 
postoperative  units,  including  ICUs,  IMCs,  and  floor  units.  This  model  will  require  data 
about  post-operative  destinations  and  length-of-stay  distributions  for  different  types  of 
surgeries.  Data  about  cardiac  surgeries  from  two  years  were  analyzed  to  develop  a 
methodology  for  computing  the  needed  information.  This  methodology  was  applied  to  a 
larger  set  of  historical  data  to  generate  a  complete  set  of  information  to  execute  the 
congestion  evaluation  model. 
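
One way such a congestion evaluation could be structured is sketched below. It is not the model under development; it is a minimal illustration that assumes each surgery type has a probability of going to each postoperative unit and a discrete length-of-stay distribution in days. The names `DESTINATION_PROB`, `LOS_DIST`, and `expected_census`, and all of the numbers, are hypothetical.

```python
from collections import defaultdict

# Hypothetical inputs (not from the study): probability that a surgery type
# is sent to each unit, and a discrete length-of-stay (LOS) distribution in
# days for each (surgery type, unit) pair.
DESTINATION_PROB = {
    "cardiac": {"ICU": 0.9, "IMC": 0.1},
    "general": {"ICU": 0.1, "IMC": 0.3, "floor": 0.6},
}
LOS_DIST = {
    ("cardiac", "ICU"): {1: 0.5, 2: 0.3, 3: 0.2},
    ("cardiac", "IMC"): {1: 1.0},
    ("general", "ICU"): {1: 1.0},
    ("general", "IMC"): {1: 0.7, 2: 0.3},
    ("general", "floor"): {1: 0.4, 2: 0.6},
}


def expected_census(schedule, horizon_days):
    """Return unit -> expected number of occupied beds for each day.

    schedule is a list of (day_index, surgery_type) pairs. A patient with a
    LOS of d days occupies a bed on the day of surgery and the d - 1 days
    that follow.
    """
    census = defaultdict(lambda: [0.0] * horizon_days)
    for day, surgery in schedule:
        for unit, p_dest in DESTINATION_PROB[surgery].items():
            los = LOS_DIST[(surgery, unit)]
            for offset in range(horizon_days - day):
                # Probability the patient is still in the unit `offset`
                # days after surgery, i.e. P(LOS > offset).
                p_stay = sum(p for d, p in los.items() if d > offset)
                census[unit][day + offset] += p_dest * p_stay
    return dict(census)


if __name__ == "__main__":
    week = [(0, "cardiac"), (0, "general"), (1, "cardiac")]
    for unit, beds in expected_census(week, horizon_days=4).items():
        print(unit, [round(b, 2) for b in beds])
```

Comparing the expected census for each unit against its bed capacity would flag the days on which a proposed schedule is likely to produce overnight boarders in the PACU.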

Informatics  subgroup  2.  Operating  Room  Glitch  Analysis  (OGA) 

The  OGA  project,  focusing  on  institutional  learning,  examined  the  workflow 
around  performance  indicators  in  the  peri-operative  environment  and  building  a  graphical 
dashboard  to  allow  data  mining  and  trend  analysis  of  operating  indicators.  The 
dashboard  platform  was  built  in  a  tiered  architecture  consisting  of  information  extraction, 
data  warehousing  and  manipulation,  and  user  interface  layers.  The  system  has  clearly 
segmented  application  components  and  a  modular  programming  code  base.  Combined 
with  web  and  programming  standards,  the  tiered  architecture  has  kept  the  software 
extensible  despite  its  complexity. 

To combat the complexity introduced by the disparate clinical information
systems (surgery, anesthesia, patient tracking, facilities, quality) used throughout the
perioperative environment, the information extraction layer uses an abstracted interface based
on RESTful web services to receive information provided by any number of systems.
Initial extraction, transformation, and loading (ETL) work has been done for the
surgical case data. Case information is pulled from the clinical database once per day and
loaded into the data warehouse. Though the current information feed polls the clinical
system once per day, the data warehouse itself will allow for clinical integration
at transactional speed as real-time information is made available.

The  data  warehousing  and  manipulation  layer  contains  a  relational  database 
(MySQL)  in  which  information  from  the  receiver  layer  is  stored.  A  business  intelligence 
rules engine, written in Erlang, provides the means to aggregate low-level information
into key performance indicators (KPIs). The semantic linking between the data is also
stored  to  provide  access  to  each  level  of  information  that  powers  the  aggregated  KPI. 

The  KPIs  are  generated  by  a  data  analyst  through  a  web  interface.  When  a  new  KPI  is 
defined by the analyst, the data is immediately aggregated and kept up to date in real time
without  the  need  for  user  interaction. 
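
The aggregation step can be illustrated with a short sketch. The production rules engine is written in Erlang; the Python below is only an illustration of the idea of rolling low-level case records up into a KPI while keeping the links back to the source records. The `KPI` class, the `evaluate` function, and the record fields (`case_id`, `service`, `delay_minutes`) are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class KPI:
    """A hypothetical KPI definition: which cases count, and what to measure."""
    name: str
    select: Callable[[Dict], bool]
    measure: Callable[[Dict], float]


def evaluate(kpi: KPI, cases: List[Dict]) -> Dict:
    """Aggregate case records into one KPI value (a mean, in this sketch).

    The IDs of the contributing cases are kept alongside the value so that
    a dashboard could drill down from the KPI to the underlying records.
    """
    rows = [c for c in cases if kpi.select(c)]
    values = [kpi.measure(c) for c in rows]
    return {
        "kpi": kpi.name,
        "value": sum(values) / len(values) if values else None,
        "source_case_ids": [c["case_id"] for c in rows],
    }


if __name__ == "__main__":
    cases = [
        {"case_id": 1, "service": "cardiac", "delay_minutes": 10},
        {"case_id": 2, "service": "general", "delay_minutes": 5},
        {"case_id": 3, "service": "cardiac", "delay_minutes": 20},
    ]
    cardiac_delay = KPI(
        name="mean start delay (cardiac)",
        select=lambda c: c["service"] == "cardiac",
        measure=lambda c: c["delay_minutes"],
    )
    print(evaluate(cardiac_delay, cases))
```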

The  user  interface  contains  three  major  components. 

1. A data manipulation layer that allows interaction with the data warehouse and
provides  analysts  with  means  to  create  and  track  new  metrics. 

2.  A  visualization  toolkit  to  create  graphs  and  web  pages  to  display  information 
effectively.  This  includes  the  means  to  create  clickable  and  animated  graphs. 

3. Semantic services which generate data drilling functionality for graphs and data
tables  without  the  need  for  extra  programming  or  user  configuration.  These 
semantic  services  also  enable  filtering  of  information  and  the  ability  to  view 
related  metrics  in  the  same  filtered  context  simultaneously. 

Completed  stages  of  the  project: 

•  A  pipeline  for  case  information  from  Surginet 

•  Business  intelligence  engine  and  data  warehouse  to  handle  aggregation, 
manipulation,  and  storage  of  data 

• Web interface to interact with and aggregate data into KPIs

• Web interface to build data visualizations

• Web interface to build dashboard pages

•  Semantic  services 

Areas  of  future  improvement: 

•  Observation  engine  to  evoke  actions  on  specified  events  within  the  data  pipeline 

•  Web  interface  to  add/edit/delete  actions  and  events  in  the  observation  engine 

•  Formalized  evaluation  of  the  scalability  of  the  warehouse 

•  Improved  usability  for  the  web  interface 

•  More  advanced  visualizations  for  user  creation 

•  Forms  management  of  quality  metrics 

Informatics  subgroup  3.  Context  Aware  Surgical  Training  (CAST) 

We  proposed  to  design  and  implement  a  prototype  context  aware  surgical  training 
environment  (CAST)  as  part  of  the  University  of  Maryland  Medical  System’s  SimCenter. 
This  system  would  be  used  to  explore  the  role  that  an  intelligent  pervasive  computing 
environment  can  play  to  enhance  the  training  of  surgical  students,  residents  and 
specialists. The research built on prior work on context aware “smart spaces” done
at UMBC, leveraging our experience in working with RFID in the DARPA Trauma Pod
program  as  well  as  in  incorporating  Web-based  infrastructure  and  software  applications 
in  academic  and  professional  development  programs.  The  project  resulted  in  a  pilot 
system  integrating  one  or  two  training  resources  available  in  the  SimCenter  into  a  context 
aware training environment that could recognize the presence of a trainee and/or mentor
and  take  appropriate  action  based  on  known  training  goals  and  parameters.  The  project 
will  advance  the  knowledge  of  context  aware  training  environments  in  a  highly  technical 
medical  field  and  provide  a  basis  for  incorporating  more  advanced  technology  assisted 
learning  experiences  in  medicine.  This  “smart  environment”  may  then,  if  successful,  be 
scaled  to  meet  the  needs  of  an  operative  environment  where  the  technological  demands 
may be similar or analogous to those seen in the training environment. A goal of this
project  was  to  advance  interactive  information,  resource,  and  content  management  via  a 
seamless  process.  Ultimately,  the  advanced  training  and  potential  for  use  in  perioperative 
environments  have  a  long-term  end  goal  of  improving  patient  safety  and  adding  to  the 
body of knowledge in surgical training. Initially, we saw a situation where clinicians in
training  can  receive  a  tailored  curriculum.  Additionally,  we  envision  a  system  that  offers 
real-time  feedback  and  decision  support  and  education  metrics  to  faculty. 

A  key  goal  this  year  was  to  prototype  the  CAST  system.  Initially,  we  met  with  the 
MASTRI  team  responsible  for  the  training  efforts  to  iron  out  the  requirements  for  the 
system and came up with the following set of tasks to be accomplished: Student Tracking,
Enforcing Prerequisites, Video Capture, and Instructor Feedback.

Also,  we  defined  a  typical  use  case  for  our  system. 

A  student  enters  the  simulation  center.  The  system  identifies  the  student  (for  instance, 
using their Bluetooth phone or their badge) and performs a prerequisite check based on the
simulator on which the student wants to perform the procedure. The student is allowed to
proceed only after completing the prerequisites. When the student indicates that they are ready
to  begin,  the  system  starts  capturing  the  external  and  internal  view  until  the  student 
indicates  that  they  have  completed  the  task.  The  captured  video  is  then  transferred  to  the 
video  server  for  review  by  the  instructor.  The  instructor  interface  allows  the  instructor  to 
see  the  entry  logs  of  students  in  terms  of  when  they  entered  and  exited  the  centre  along 
with  the  corresponding  external  view. 
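
The prerequisite gate in this use case can be sketched as follows. This is an illustrative outline under stated assumptions, not the CAST implementation: the `PREREQUISITES` table, the chapter and simulator names, and the `Student` and `start_session` names are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from simulator to the training chapters a student
# must have completed before using it.
PREREQUISITES = {
    "laparoscopic_vr": {"fls_chapter_1", "fls_chapter_2"},
    "box_trainer": set(),
}


@dataclass
class Student:
    user_id: str
    completed: set = field(default_factory=set)


def missing_prerequisites(student, simulator):
    """Return the chapters the student still has to complete."""
    return PREREQUISITES.get(simulator, set()) - student.completed


def start_session(student, simulator):
    """Allow the session only if every prerequisite is complete.

    When allowed, the result lists the two views to be captured, mirroring
    the external and internal views described in the use case.
    """
    missing = missing_prerequisites(student, simulator)
    if missing:
        return {"allowed": False, "missing": sorted(missing)}
    return {"allowed": True, "capture": ["external", "internal"]}


if __name__ == "__main__":
    resident = Student("resident01", {"fls_chapter_1"})
    print(start_session(resident, "laparoscopic_vr"))
    resident.completed.add("fls_chapter_2")
    print(start_session(resident, "laparoscopic_vr"))
```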

We  employed  the  spiral  prototyping  approach  as  an  experimental  test  bed;  we 
designed  and  implemented  an  initial  system  prototype  that  would  meet  the  above 
functional  requirements.  The  prototype  integrates  two  machines  with  each  simulator  —  a 
small  Nokia  800  device  for  resident  interaction,  and  a  larger  PC  for  video  capture.  Note 
that  this  is  for  the  proof  of  concept.  A  single  small  form  factor  but  computationally 
powerful  machine  could  be  used  instead.  In  fact,  for  simulators  such  as  the  VR,  we 
expect  that  eventually  manufacturers  could  integrate  our  system  directly  into  the 
computer  that  drives  the  simulation. 

Our  prototype  used  Bluetooth  for  localization  of  residents  in  the  simulation  centre. 
It  was  designed  to  be  modular,  so  that  any  other  technology  (such  as  resident  ID  cards) 
could  be  integrated  easily.  We  also  hosted  training  materials  including  videos  for  FLS, 
Kentucky  and  Rosser  tasks  in  our  system,  and  tracked  student  progress  through  the 
chapters  checked  out.  This  was  used  for  enforcing  prerequisites  when  students  entered 
the  simulation  centre  to  perform  procedures.  In  addition  to  enforcing  prerequisites,  there 
was  a  need  for  the  instructors  to  visually  see  what  the  residents  were  doing  during  their 
simulation procedures. We use the N800’s built-in camera to capture the residents’ external
views.  These  video  feeds  are  then  fed  into  a  central  server  for  review  by  the  instructor. 

For  location  detection,  we  also  experimented  with  using  the  Awarepoint  tags.  Awarepoint 
uses a ZigBee-based mesh network for localization and exposes the location information
through  a  web  service.  Our  experiments  indicated  that  Awarepoint  could  provide  us  room 
level  information,  but  not  anything  finer.  While  this  would  help  identify  if  the  residents 
were  in  the  simulation  center,  it  would  not  help  determine  which  machine  they  were 
using,  which  was  needed  for  CAST.  We  demonstrated  our  first  system  prototype  at  the 
ORF  workshop  by  going  through  a  typical  student  workflow. 

Based on feedback on individual components of the first prototype, we started the
second version of the prototype to be deployed at the MASTRI center. The key changes in
the second prototype from the first one are described below.

• We no longer use Awarepoint for location detection since it could provide us only room
level  accuracy. 

•  On  the  student  identification  front,  we  are  using  a  standard  username  and 
password  method  for  now.  Also,  since  we  have  external  camera  views  from  the 
N800,  students  identified  can  be  verified  during  the  review  process  by  the 
instructor.  Bluetooth  based  identification  exists,  but  is  not  used  since  we  were  told 
that  most  residents  may  not  have  phones  with  Bluetooth.  Multiple  candidate 
technologies for identification such as Bluetooth, RFID, near-field RF badges, etc.
have  emerged  and  been  discussed  with  MASTRI  staff.  No  single  choice  has  been 
made  yet  -  the  idea  is  to  first  make  the  system  robust  from  a  use  perspective,  and 
then  integrate  identification  technologies  based  on  further  discussion  with 
MASTRI  staff.  Our  system  is  capable  and  flexible  enough  to  handle  a  variety  of 
lower  level  locationing  technologies  and  therefore  we  would  choose  the  one  that 
is most practical in the MASTRI scenario.

•  Due  to  hospital  network  firewall  policies,  we  had  to  move  away  from  using  a 
wireless  network  for  transferring  videos  from  the  N800  to  the  MASTRI  video 
server.  We  currently  achieve  this  by  tunneling  through  the  internal  view  capturing 
machine  which  is  hooked  to  the  N800  by  a  USB  cable. 

•  Prerequisite  checks  are  temporarily  suspended  since  the  initial  classes  being 
taught  in  MASTRI  are  not  following  FLS. 

We  also  focused  on  moving  the  system  from  UMBC  machines  to  the  MASTRI 
infrastructure where they will be housed. We purchased a small form factor Dell machine to be
used for capturing internal views from simulators. Storage was purchased and added to
the mastri-internal server for archiving both internal and external video feeds with help
from a computer scientist. Also, we have

•  Integrated  the  student  database  from  the  hospital 

•  Hosted  FLS  and  other  training  videos  on  the  hospital  infrastructure 

•  Hacked  internal  views  of  the  simulators 

We  have  now  developed  the  system  to  capture  internal  video  feeds  and  metrics  from  the 
following  simulators 

•  Promis 

•  Stryker 

• Laparoscopic VR simulator

We  use  external  s-video  frame  grabbers  to  capture  the  simulator  internal  video  feeds. 
These  feeds  are  synchronized  with  the  external  view  from  the  N800  and  stored  on  the 
video  server.  Thus,  the  instructor  now  has  access  to  both  the  internal  and  external  feeds 
during  review,  and  consequently  they  can  provide  better  feedback.  Currently  our  system 
uses  email  to  send  back  feedback. 

Web  Interface: 

We  have  integrated  the  student  database  into  our  web-based  curriculum  management 
system.  The  student  database  contains  all  the  current  residents  and  one  guest  account  that 
can  be  used  for  testing.  If  a  student  wants  to  view  the  training  videos,  he  or  she  will  first 
need  to  log  in  using  their  SMail  Userid.  When  they  pass  the  authentication,  a  categorized 
web  structure  will  be  displayed,  and  they  can  choose  to  sort  the  tasks  by  category,  by 
difficulty  (FLS  integrated),  or  by  each  (Basic,  Instrument,  Procedural  Skills,  and  FLS), 
which  is  shown  in  the  following  screenshot.  This  structure  was  developed  in  consultation 
with  the  MASTRI  team,  particularly  Ivan  George  and  Ethan  Hagan. 

Then the student can pick any training video they want to view. For example, if the
student chooses the bagging skill, the system displays the following screen.

As  for  the  instructor  interface,  the  instructor  first  needs  to  pass  authentication  to  access 
the  student  training  records.  Then,  they  can  pick  the  student  name  that  they  want  to  view 
from a drop-down list that contains all the residents. The appropriate student record will
pop  up  in  the  next  page,  which  contains  the  following  information:  the  chapters  that  the 
student has checked out, the student training history (simulator type, start time, end time,
internal  video  record,  external  video  record),  and  the  instructor  can  provide  their 
feedback  for  every  training  record  of  this  student  via  email. 

At  the  end  of  this  period  of  performance,  research  on  the  CAST  system  was 
judged  to  have  reached  its  maximum  potential.  Advances  in  related  technology  and  off- 
the-shelf  equipment  facilitated  the  accomplishment  of  the  broad  goals  of  the  project. 
Going forward, the team of computer scientists, faculty and students of the University of
Maryland - Baltimore was redirected to a project of vital importance to the surgical
environment: enhancing video summarization of surgical cases.

Informatics subgroup 3 (revised). Video Summarization

Background

The  research  endeavor  is  a  recent  inclusion  to  the  Innovations  in  the  Surgical 
Environment program. Thus, the initial description of our work will include a
meta-analysis of sorts, acquainting the reader with the technology and technical developments
in  the  arena  of  video  summarization.  Within  this  section  of  the  report,  references  to  the 
literature  are  inserted  apart  from  the  overall  bibliography  of  work  conducted  under  the 
auspices  of  the  Innovations  in  the  Surgical  Environment  research  program. 

Laparoscopic  surgery  is  a  minimally  invasive  technique.  It  was  first  performed  in 
1987,  and  is  now  the  method  of  choice  for  a  number  of  surgical  procedures.  The 
laparoscopic approach involves the use of narrow tubes, called trocars, which are
inserted  into  the  abdomen  through  small  incisions.  A  camera  is  passed  through  one  of  the 
trocars  to  visualize  the  surgical  field.  Instruments  are  passed  through  the  other  trocars  to 
cut,  manipulate,  and  sew. 

Compared  with  an  open  procedure,  patients  who  undergo  laparoscopic  surgery 
have smaller scars, reduced pain, and a quicker recovery. There are, however, a number of
technical  challenges  with  the  laparoscopic  approach.  Access  is  limited  to  small  incisions 
through  which  long  instruments  are  passed.  Tactile  feedback  is  reduced.  Visualization  is 
restricted  to  two-dimensional  video  [1]. 

Because  of  these  technical  challenges,  the  traditional  apprenticeship  model  is  not 
sufficient  for  trainees  to  develop  laparoscopic  skills.  Additional  methods  are  used  to 
develop  competency,  such  as  box  trainers,  virtual  reality  simulators,  and  video-based 
assessment [2].

Our  overall  goal  is  to  develop  a  software  tool  to  assist  with  video-based 
assessment.  Such  a  tool  would  automatically  divide  each  video  into  the  eleven  basic  steps 
of  the  cholecystectomy  (360-degree  surveillance,  trocar  placement,  preliminary 
dissection,  and  so  on)  and  provide  a  set  of  tools  to  efficiently  review  each  video  segment. 
It  would  allow  an  evaluator  to  navigate  to  any  section  of  the  surgical  procedure,  skip  to  a 
particular  section,  alter  the  viewing  order,  spend  more  time  on  critical  sections,  or  view 
the same section of a procedure from multiple trainees. See Figure 1.

[Figure 1 diagram; recoverable labels: Video of Procedure is Created; Segmented Video;
Alter the viewing order; View the same section of a procedure from multiple trainees.]

Figure 1. Proposed laparoscopic video segmentation and review scheme


This  report  presents  an  initial  feasibility  study.  We  used  image  classification 
techniques  and  distance  metrics  to  identify  the  critical  view  of  a  laparoscopic 
cholecystectomy (a surgical procedure to remove the gallbladder). The critical view is the
point  in  the  surgery  when  the  essential  anatomy  has  been  identified,  and  is  an  important 
validation  step  before  clipping  the  cystic  duct  and  artery.  See  Figure  2. 


Figure  2.  Critical  view  image. 

Spectral and textural features can both be used for image analysis. Spectral
features  describe  the  tonal  or  color  variations  in  an  image.  Histograms  are  used  to 
represent  the  distribution  of  tones  in  an  image.  Textural  features  describe  the  spatial 
distribution  of  tonal  variations,  and  are  often  represented  as  a  gray-level  co-occurrence 
matrix  (GLCM)  [3].  Related  efforts  include  the  use  of  color  histograms  to  segment 
hysteroscopy  video  [4],  color  histograms  and  state  information  to  segment 
echocardiogram video [5], and the use of texture features to analyze images from
breast tissue histopathology [6], prostate cancer [7], and capsule endoscopy [8].

Methods 

Our  objective  was  to  compare  image  features  using  distance  metrics,  in  an  attempt 
to  identify  the  critical  view  of  a  laparoscopic  cholecystectomy.  This  was  to  determine 
feasibility  of  recognizing  other  steps  in  a  laparoscopic  cholecystectomy  for  the  purpose  of 
surgical  video  segmentation. 

Five  laparoscopic  cholecystectomy  videos  were  provided  to  us  by  the  University 
of  Maryland  School  of  Medicine,  which  contained  24-bit  color  video  at  29  frames  per 
second.  The  videos  were  reviewed  by  a  surgeon  to  determine  the  timing  of  the  critical 
view.  We  used  FFmpeg  to  convert  the  video  to  a  sequence  of  24-bit  color  JPEG  images 
(http://ffmpeg.mplayerhq.hu/). We used ImageJ to extract image features
(http://rsbweb.nih.gov/ij/).  The  distance  metric  we  used  to  compare  these  results  was  the 
Jeffrey  Divergence  [9].  See  Figure  3. 

We  applied  the  Jeffrey  Divergence  to  the  data  5  times,  each  time  using  a  critical 
view  image  from  a  different  case  as  our  basis  for  comparison.  If  a  comparison  using  the 
Jeffrey Divergence was below a given threshold, the image was assumed to be a critical view
image.  We  chose  threshold  values  empirically  to  maximize  both  sensitivity  and 
specificity. 
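
The comparison step can be sketched as follows. The sketch assumes the form of the Jeffrey Divergence commonly given for two normalized histograms H and K, d(H, K) = sum over i of [h_i log(h_i / m_i) + k_i log(k_i / m_i)] with m_i = (h_i + k_i) / 2; the function names, the toy histograms, and the threshold value are illustrative and are not taken from the study.

```python
import math


def normalize(counts):
    """Scale raw histogram counts so that they sum to 1."""
    total = float(sum(counts))
    if total <= 0:
        raise ValueError("histogram must contain at least one count")
    return [c / total for c in counts]


def jeffrey_divergence(h, k):
    """Jeffrey Divergence between two normalized histograms of equal length.

    Empty bins contribute nothing, which avoids taking log(0).
    """
    if len(h) != len(k):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for hi, ki in zip(h, k):
        mi = (hi + ki) / 2.0
        if hi > 0:
            total += hi * math.log(hi / mi)
        if ki > 0:
            total += ki * math.log(ki / mi)
    return total


def is_critical_view(candidate_counts, reference_counts, threshold):
    """Label a frame as critical view if it is close enough to the reference."""
    d = jeffrey_divergence(normalize(candidate_counts),
                           normalize(reference_counts))
    return d < threshold


if __name__ == "__main__":
    reference = [40, 30, 20, 10]   # toy histogram of a known critical view
    similar = [38, 32, 19, 11]
    different = [5, 5, 10, 80]
    # The threshold here is arbitrary; the study chose thresholds empirically.
    print(is_critical_view(similar, reference, threshold=0.05))
    print(is_critical_view(different, reference, threshold=0.05))
```

The measure is symmetric and equals zero for identical histograms, so a lower value means a closer match to the reference critical view image.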


[Figure 3 diagram; recoverable labels: Image Extraction (FFmpeg); Random Images;
Known Critical View Images; Feature Extraction (ImageJ); Color Features;
Textural Features; Distance Metric; Critical View?]

Figure 3. Methodology.


Results 

A summary of the images used is shown in Table 1. The data included 378
representative images taken randomly throughout the 5 laparoscopic cholecystectomies,
with 104 of the images collected near the critical view. We analyzed 49 separate spectral
and  textural  features,  the  most  promising  of  which  are  shown  below.  They  included  4 
textural  features:  energy  (uniformity),  entropy  (complexity),  contrast  (local  variation), 
and  correlation  (linear  patterns).  They  also  included  4  spectral  features:  3D  histogram 
(color  distribution  in  the  3  dimensional  space),  color  histogram  (distribution  of  each  color 
individually),  and  binary  versions  of  the  3D  and  color  histograms. 
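
For readers unfamiliar with the textural features, the sketch below computes the four of them from a gray-level co-occurrence matrix in plain Python. The study used ImageJ for feature extraction; this code is only an illustration and assumes the standard Haralick-style definitions (energy taken as the sum of squared matrix entries), a single horizontal pixel offset, and an image already reduced to a small number of gray levels. The function names are hypothetical.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one pixel offset.

    image is a list of rows of integer gray levels in range(levels).
    Entry [i][j] is the fraction of pixel pairs in which a pixel of level i
    has a neighbor of level j at offset (dx, dy).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:
        raise ValueError("image is too small for the requested offset")
    return [[n / total for n in row] for row in counts]


def texture_features(p):
    """Energy, entropy, contrast, and correlation of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    energy = sum(v * v for _, _, v in cells)
    entropy = -sum(v * math.log(v) for _, _, v in cells if v > 0)
    contrast = sum((i - j) ** 2 * v for i, j, v in cells)
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    sd_i = math.sqrt(sum((i - mu_i) ** 2 * v for i, _, v in cells))
    sd_j = math.sqrt(sum((j - mu_j) ** 2 * v for _, j, v in cells))
    cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
    # Correlation is undefined for a perfectly uniform image; report 0.0.
    correlation = cov / (sd_i * sd_j) if sd_i > 0 and sd_j > 0 else 0.0
    return {"energy": energy, "entropy": entropy,
            "contrast": contrast, "correlation": correlation}


if __name__ == "__main__":
    stripes = [[0, 0, 1, 1],
               [0, 0, 1, 1],
               [2, 2, 3, 3],
               [2, 2, 3, 3]]
    print(texture_features(glcm(stripes, levels=4)))
```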


Feature                   Sensitivity   Specificity
Energy                    66.4%         67.2%
Entropy                   60.9%         62.5%
Contrast                  62.4%         62.6%
Correlation               62.2%         60.0%
3D Histogram              64.0%         64.8%
Color Histogram           71.5%         72.2%
Binary 3D Histogram       61.2%         59.5%
Binary Color Histogram    66.0%         66.1%

Table 1. Sensitivity and specificity.
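
The two measures reported in Table 1 can be computed from a set of distances and surgeon-provided labels as sketched below. The data in the example are invented for illustration, and the function name and argument layout are assumptions; an image is predicted to be a critical view when its distance falls below the threshold, as described in the Methods.

```python
def sensitivity_specificity(distances, is_critical, threshold):
    """Return (sensitivity, specificity) for a distance threshold.

    distances[i] is the distance of image i from the reference critical
    view image; is_critical[i] is the ground-truth label for that image.
    """
    tp = fp = tn = fn = 0
    for d, truth in zip(distances, is_critical):
        predicted = d < threshold
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif not predicted and truth:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    # Invented example data, not results from the study.
    distances = [0.10, 0.40, 0.30, 0.90]
    labels = [True, True, False, False]
    for t in (0.2, 0.35, 0.5):
        print(t, sensitivity_specificity(distances, labels, t))
```

Sweeping the threshold in this way shows the trade-off between the two measures, which is why a single empirically chosen value has to balance them.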


Discussion 

Our initial results show promise, with sensitivity and specificity up to 72%.
Energy, which is a textural measure of pixel similarity, and color histogram, which is a
measure of color distribution, performed the best. Accuracy, however, must improve
before there can be any practical application of this approach.

The  image  analysis  process  was  very  data  intensive.  At  29  frames  per  second, 
roughly  100,000  images  were  produced  from  every  hour  of  video.  Each  3D  histogram 
generated  about  17,000,000  pieces  of  data  per  image.  Because  of  this,  we  quantized  this 
color  distribution  into  a  more  manageable  size,  which  is  probably  why  it  did  not  perform 
as  well. 
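
The data volumes quoted above follow from simple arithmetic, shown in the sketch below together with one possible quantization scheme. The frame rate and 24-bit color depth come from the Methods; the choice of 8 bins per channel and the function names are assumptions for illustration, since the quantization actually used is not specified here.

```python
FRAMES_PER_SECOND = 29      # frame rate reported for the source videos
LEVELS_PER_CHANNEL = 256    # 24-bit color: 8 bits each for R, G, and B


def frames_per_hour(fps=FRAMES_PER_SECOND):
    """Number of still images produced by one hour of video."""
    return fps * 60 * 60


def full_histogram_bins(levels=LEVELS_PER_CHANNEL):
    """Number of bins in an unquantized 3D (R, G, B) histogram."""
    return levels ** 3


def quantized_bin(r, g, b, bins_per_channel=8):
    """Map an 8-bit RGB pixel to a bin index in a coarser 3D histogram.

    bins_per_channel is assumed to divide 256 evenly.
    """
    scale = 256 // bins_per_channel
    qr, qg, qb = r // scale, g // scale, b // scale
    return (qr * bins_per_channel + qg) * bins_per_channel + qb


def quantized_histogram(pixels, bins_per_channel=8):
    """Count pixels per quantized color bin."""
    hist = [0] * bins_per_channel ** 3
    for r, g, b in pixels:
        hist[quantized_bin(r, g, b, bins_per_channel)] += 1
    return hist


if __name__ == "__main__":
    print(frames_per_hour())        # 104400, roughly 100,000 images
    print(full_histogram_bins())    # 16777216, about 17,000,000 bins
    print(len(quantized_histogram([(0, 0, 0), (255, 255, 255)])))  # 512
```

With 8 bins per channel, each image is described by 512 values instead of roughly 17 million, at the cost of merging colors that a finer histogram would keep apart.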

When  interpreting  our  results  it  is  important  to  consider  several  limitations.  The 
study  was  small  in  size.  The  cases  were  restricted  to  a  single  academic  medical  center. 
Finally,  our  comparisons  were  limited  to  one  feature  from  one  image  at  a  time. 

We  are  currently  working  on  a  more  robust  image  classifier,  based  on  the  lessons 
learned  from  this  research.  We  are  using  particle  analysis  and  edge  analysis  [10]  to 
identify  the  characteristics  of  the  major  objects  in  an  image.  We  are  also  experimenting 
with  support  vector  machines,  which  are  supervised  learning  methods  that  use  multiple 
image features to classify images using an n-dimensional hyperplane [11]. Other
features  being  considered  are  the  use  of  temporal  information,  logical  workflow,  and 
relevant  clinical  data  to  increase  the  accuracy  of  our  image  classifier. 

In  summary,  we  compared  image  features  with  a  distance  metric  to  identify  the 
critical  view  of  a  laparoscopic  cholecystectomy.  Our  initial  results  were  promising,  but 
more  work  needs  to  be  done  to  increase  accuracy.  We  are  currently  experimenting  with 
particle  analysis,  edge  analysis,  and  support  vector  machines  as  ways  to  create  a  more 
robust  image  classifier. 

Informatics subgroup 4. Operating Room Clutter (ORC)

The  project  team  has  worked  on  the  use  of  advanced  video  technology  to  support 
coordination  in  operating  rooms.  Our  activities  were  in  four  areas  indicated  below,  each 
of  which  contains  a  summary  of  publications.  For  this  portfolio  of  the  Informatics  pillar 
of  Innovations  in  the  Surgical  Environment,  the  measure  of  our  success  is  depicted  in  the 
number  and  nature  of  professional  publications.  Like  other  pillars  of  innovation,  there  is 
a  ground-breaking  aspect  to  the  work.  We  perceive  a  scientific  responsibility  to  gather 
and  disseminate  information  as  quickly  and  widely  as  possible.  Thus,  our  report  refers 
the  reader  to  specific  publications  of  work  conducted  in  each  research  activity.  All 
publications referred to may be found on our website: http://hfrp.umaryland.edu. For full
length  journal  articles,  PDF  files  may  be  downloaded.  For  others,  abstracts  are  available. 
In  all,  we  published  8  full-length  peer  reviewed  journal  articles,  2  full-length  peer 
reviewed  proceeding  articles,  and  8  conference  abstracts.  The  references  below  can 
provide  further  details. 

A.  Models  of  decision  making  for  operating  room  management. 

We  reviewed  literature  and  developed  a  synthesis  report  on  the  state  of  the  art  of 
decisions  on  the  day  of  surgery.  Furthermore,  we  developed  models  for  decision 
support  systems  for  operating  room  management.  The  activities  in  this  area  were 
reported in the following publications:

1.  Dexter  F,  Xiao  Y,  Dow  AJ,  Strader  MM,  Ho  D,  Wachtel  RE.  Coordination  of 
Appointments  for  Anesthesia  Care  Outside  of  Operating  Rooms  Using  an 
Enterprise Wide Scheduling System. Anesthesia and Analgesia. 105:1701-1710. 2007

2. Xiao Y, Strader M, Hu P, Wasei M, Wieringa P. Visualization Techniques for
Collaborative Trajectory Management. ACM Conference on Human Factors in
Computing Systems, pp. 1547-1552. 2006

3.  Xiao  Y,  Wasei  M,  Hu  P,  Wieringa  P,  Dexter  F.  Dynamic  Management  in 
Perioperative  Processes:  A  Modeling  and  Visualization  Paradigm.  12th  IFAC 
Symposium  on  Information  Control  Problems  in  Manufacturing.  (3)647-52.  2006 

4.  Dutton  R,  Ho  D,  Hu  P,  Mackenzie  CF,  Xiao  Y.  Decision  Making  by  Operating 
Room  Managers:  The  Burden  of  Changes.  Anesthesiology,  103:A1175.  2005 

5.  Dexter  F,  Epstein  RH,  Traub  RD,  &  Xiao  Y.  Making  Management  Decisions  on 
the  Day  of  Surgery  Based  on  Operating  Room  Efficiency  and  Patient  Waiting 
Times.  Anesthesiology,  101(6):1444-1453.  2004 

B.  Operating  room  multimedia  system  design  and  methodology. 

We  developed  technology,  primarily  based  on  algorithms  of  video  processing  and 
biosignal processing, to display the status of operating rooms. The displays are intended to
increase situational awareness. The technological advances made by our group
were  reported  in  the  following  publications: 

6.  Xiao  Y,  Schimpff  S,  Mackenzie  CF,  Merrell  R,  Entin  E,  Voigt  R,  Jarrell  B.  Video 
Technology  to  Advance  Safety  in  the  Operating  Room  and  Perioperative 
Environment.  Surgical  Innovation.  14(1):  52-61.  2007 

7.  Hu  P,  Xiao  Y,  Ho  D,  Mackenzie  CF,  Hu  H,  Voigt  R,  Martz  D.  Advanced 
Visualization  Platform  for  Surgical  Operating  Room  Coordination:  Distributed 
Video  Board  System.  Surgical  Innovation.  13(2): 129-135.  2006 

8. Hu P, Seagull FJ, Mackenzie CF, Seebode S, Brooks T, Xiao Y. Techniques for
Ensuring  Privacy  in  Real-Time  and  Retrospective  Use  of  Video.  Telemedicine 
and  e-Health,  12(2):  204,  T1E1.  2006 

9.  Xiao  Y,  Hu  P,  Hu  H,  Ho  D,  Dexter  F,  Mackenzie  CF,  Seagull  FJ,  Dutton  D.  An 
algorithm  for  processing  vital  sign  monitoring  data  to  remotely  identify  operating 
room occupancy in real-time. Anesthesia & Analgesia, 101(3):823-829. 2005

10. Hu PF, Burlbaugh M, Xiao Y, Mackenzie CF, Voigt R, Brooks T, Fraser L,
Connolly MR, Herring T. Video Infrastructure and Application Design Methods
for an OR of the Future. Telemedicine and e-Health. 11(2), 211, T3C2. 2005

11. Hu PF, Hu H, Seagull FJ, Mackenzie CF, Voigt R, Martz D, Dutton R, Xiao Y.
Distributed Video Board: Advanced Telecommunication System for Operation
Room Coordination. Telemedicine and e-Health. 11(2), 248, P28. 2005

12. Hu PF, Xiao Y, Mackenzie CF, Seagull FJ, Brooks T, LaMonte MP, & Gagliano
D. Many to One to Many Telemedicine Architecture and Applications.
Telemedicine Journal and e-Health. 10(Supplement 1), S-39. 2004

C.  Survey  and  descriptive  studies  of  operating  room  management,  with  and 
without  the  support  of  advanced  video  technology. 

In  conjunction  with  technology  development,  we  conducted  observational  and 
survey  studies  of  operating  room  management.  These  studies  and  associated 
results  were  in  the  following  publications: 

13. Seagull FJ, Xiao Y, & Plasters C. Information Accuracy and Sampling Effort: A
Field Study of Surgical Scheduling Coordination. IEEE Transactions on Systems,
Man, and Cybernetics, Part A: Systems and Humans. 24(6), 764-771. 2004

14. Dutton R, Hu PF, Mackenzie CF, Seebode S, Xiao Y. A Continuous Video
Buffering System for Recording Unscheduled Medical Procedures.
Anesthesiology, 103:A1241. 2005

15.  Gilbert  TB,  Hu  PF,  Martz  DG,  Jacobs  J,  Xiao  Y.  Utilization  of  Status  Monitoring 
Video  for  OR  Management.  Anesthesiology,  103:A1263.  2005 

16. Dutton R, Hu P, Seagull FJ, Scalea T, Xiao Y. Video for Operating Room
Coordination: Will the Staff Accept It? Anesthesiology, 101:A1389. 2004

D.  Technology  evaluation. 

We conducted evaluation studies of the technology deployed. The primary focus
was on user acceptance and usage patterns. This focus was chosen because the
current science of operating room management has concluded that improved
decision making on the day of surgery will lead to improvement in intangible
outcomes, such as situation awareness, but is unlikely to lead to improvement in
operating room throughput (e.g., volumes and economic returns). Our work was
reported in the following publications:

17. Xiao Y, Dexter F, Hu PF, Dutton R. Usage of Distributed Displays of Operating
Room Video when Real-Time Occupancy Status was Available. Anesthesia and
Analgesia. 106(2):554-560. 2008

18. Kim Y-J, Xiao Y, Hu P, Dutton RP. Staff Acceptance of Video Monitoring for
Coordination: A Video System to Support Perioperative Situation Awareness.
Journal of Clinical Nursing (accepted). 2007

Informatics  subgroup  5.  Improving  Perioperative  Communications: 

During  the  period  of  performance  of  this  project,  multiple  versions  of  work  teams 
were  tried  in  an  effort  to  get  the  right  people  to  the  task.  A  correct  mix  of  expertise  was 
determined  including  representatives  of  Perioperative  scheduling,  Administration,  IT 
Support,  as  well  as  team  members  from  surgery,  anesthesiology,  and  other  OR  services. 

Data  Analysis 

The team has actively re-reviewed the performance metrics
available from various perioperative data stores. In particular, recent analysis has
benefited from work performed by the OGA team.

Background: 

In the UMMS OR, the Cardiac Surgery Service uses a common
communications point (a “cardiac phone line”) on which schedule information is
recorded and made available to any team member who calls the line.
The cardiac phone line is scripted and is actively in use
through a voice mail system. It can be altered only by dedicated personnel with password
access. The script contains the following standardized information: identification of the
individual providing the information, the date of surgery, the total number of cases, OR
location, patient name, case order, medical record number, age, surgeon, anesthesiologist
and procedure. Evening schedule updates are made possible through a second
phone line option.

Problem Statement:

After some effort, we can now track updates on the phone line and
correlate these updates with OR start delays. Thus, we refined the IPC question to the
following: Does more accurate information, as evidenced by updates on the phone line
(i.e., improved communication), result in fewer problems with cardiac surgical cases
starting on time in the morning? Specifically, are instruments better prepared for the
procedure, are operating rooms better equipped for the appropriate case, are the correct
pick lists used for the correct surgeon, and is there less of a transport delay because the
patient’s hospital location has been identified? The question refers to some of the delay
codes that are currently used by the operating room tracking system and reported for
glitch analysis.

With the assistance of the communications personnel, we will reconfigure the cardiac
phone line so that we can track the calls made to it. This will enable us to determine
the key personnel who use the phone line, the groups of personnel who use it
(e.g., nursing, anesthesia, perfusion), the groups that do not use phone line
information (e.g., anesthesia techs), and whether there is a time variable: Is there a
better time to call for updates? Should updates be made at predetermined times, or
should they be more dynamic?

We hypothesize that information gained from increased communication improves OR
efficiency. If this is the case, we can then move to see whether more real-time enabling
technologies might be deployed to other services within the UM ORs and perhaps other
ORs  “everywhere”. 
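
The analysis planned above can be sketched in a few lines. The following is only an illustration under stated assumptions: the call log, the list of expected groups, and the per-day delay records are invented for the example, and the function names are hypothetical rather than part of any UMMS system.

```python
from collections import Counter
from statistics import mean

# Hypothetical call log: (caller_group, hour_of_call). Invented for illustration.
calls = [
    ("nursing", 18), ("anesthesia", 18), ("perfusion", 19),
    ("nursing", 6), ("nursing", 18), ("anesthesia", 6),
]

# Groups that are expected to use the line (an assumed list).
expected_groups = {"nursing", "anesthesia", "perfusion", "anesthesia_tech"}


def calls_by_group(log):
    """Count calls per personnel group."""
    return Counter(group for group, _ in log)


def silent_groups(log, expected):
    """Return the expected groups that never called the line."""
    return expected - set(calls_by_group(log))


def busiest_hour(log):
    """Return the hour of day with the most calls."""
    return Counter(hour for _, hour in log).most_common(1)[0][0]


# Hypothetical per-day records: (number of phone-line updates,
# first-case start delay in minutes). Invented for illustration.
days = [(0, 25), (1, 10), (2, 5), (0, 30), (3, 0), (1, 15)]


def mean_delay_with_and_without_updates(records):
    """Compare mean start delay on days with updates against days without."""
    with_updates = [delay for updates, delay in records if updates > 0]
    without_updates = [delay for updates, delay in records if updates == 0]
    return mean(with_updates), mean(without_updates)
```

A comparison of group means like this only suggests an association; testing the hypothesis would require a proper statistical analysis of the real delay codes.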

The  IPC  project  progressed  to  identify  and  utilize  new  technologies  (Cell,  WiFi, 
IM,  Web  fusion  technologies)  being  developed  in  the  UM  Radiology  department.  This 
simple phone line will establish a form of communication that is more mobile, accurate,
and up-to-date, and that shares a common lexicon.

B.  Simulation 

This  report  introduces  the  basics  of  the  Maryland  Virtual  Patient  simulation 
approach,  discusses  its  place  on  the  map  of  intelligent  systems  in  clinical  medicine  and 
describes  the  project’s  status  and  research  and  development  activity  presently  under  way. 
The  work  on  the  Maryland  Virtual  Patient  project  will  continue  after  termination  of  the 
current  contract.  Additional  funds  from  government  agencies  will  be  sought  to  sustain 
the  effort  on  this  important  contribution  to  simulation  training:  that  is,  the  only  known 
cognitive  simulation  training  system. 

Simulation:  The  Maryland  Virtual  Patient 

We  present  here  a  simplified  description  of  the  MVP  simulation,  interaction  and 
tutoring  system.  A  virtual  patient  instance  is  launched  and  starts  its  simulated  life,  with 
one  or  more  diseases  progressing.  When  the  virtual  patient  develops  a  certain  level  of 
symptoms, it presents to the attending physician, the system’s user. The user can carry
out,  in  an  order  of  his  or  her  choice,  a  range  of  actions:  interview  the  patient,  order 
diagnostic  tests,  order  treatments,  and  schedule  the  patient  for  follow-up  visits.  The 
patient  can  also  automatically  initiate  follow-up  visits  if  its  symptoms  reach  a  certain 
level  before  a  scheduled  follow-up.  This  patient-physician  interaction  can  continue  as 
long as the patient “lives.”

As of the time of writing, the implemented MVP system includes a realization of
all  of  the  above  functionalities,  though  a  number  of  means  of  realization  are  temporary 
placeholders  for  more  sophisticated  solutions,  currently  under  development.  The  most 
obvious  of  the  temporary  solutions  is  the  use  of  menu-based  patient-user  interaction 
instead  of  natural  language  interaction.  While  this  compromise  is  somewhat  unnatural  for 
our  group,  which  has  spent  the  past  20  years  working  on  knowledge-based  NLP,  it  has 
proved  useful  in  permitting  us  to  focus  attention  on  the  non-trivial  core  modeling  and 
simulation  issues  that  form  the  backbone  of  the  MVP  system. 
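
The visit lifecycle described above (disease progression, presentation once symptoms reach a certain level, and scheduled versus patient-initiated follow-up) can be illustrated with a minimal sketch. It is not the MVP implementation: the class, the integer severity scale, the daily time step, and all numeric values are assumptions made for the example.

```python
class VirtualPatient:
    """Toy patient whose symptom severity grows by a fixed amount each day."""

    def __init__(self, severity=0, progression=10, visit_threshold=50):
        self.severity = severity                # 0..100, arbitrary scale (assumed)
        self.progression = progression          # severity gained per day (assumed)
        self.visit_threshold = visit_threshold  # level at which care is sought

    def advance_one_day(self):
        self.severity = min(100, self.severity + self.progression)

    def wants_visit(self):
        return self.severity >= self.visit_threshold


def first_visit(patient, scheduled_followup_day, max_days=30):
    """Return (day, reason) for the first visit within max_days."""
    for day in range(1, max_days + 1):
        patient.advance_one_day()
        # The patient initiates a visit if symptoms cross the threshold
        # before the scheduled follow-up.
        if patient.wants_visit():
            return day, "patient-initiated"
        if day == scheduled_followup_day:
            return day, "scheduled"
    return None, "no visit"
```

For instance, with the default values the patient presents on its own before a follow-up scheduled for day 10, whereas a slowly progressing patient is first seen at the scheduled visit.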

MVP  currently  covers  six  esophageal  diseases  pertinent  to  clinical  medicine: 
achalasia,  gastroesophageal  reflux  disease  (GERD),  laryngopharyngeal  extraesophageal 
reflux  disease  (LERD),  LERD-GERD  (a  combination  of  LERD  and  GERD),  scleroderma 
esophagus  and  Zenker’s  diverticulum.  At  the  beginning  of  a  simulation  session,  the 
system presents the user with a virtual patient about whose diagnosis he or she initially has no
knowledge.  The  user  then  attempts  to  manage  the  patient  by  conducting  office  interviews, 
ordering  diagnostic  tests  and  prescribing  treatments.  Answers  to  user  questions  and 
results  of  tests  are  stored  in  the  user’s  copy  of  the  patient  profile,  represented  as  a  patient 
chart.  At  the  beginning  of  the  session,  the  chart  is  empty  and  the  user’s  cognitive  model 
of  the  patient  is  generic  -  it  is  just  a  model  of  the  generalized  human.  The  process  of 
diagnosis  results  in  a  gradual  modification  of  the  user’s  copy  of  the  patient’s  profile  so 
that  in  the  case  of  successful  diagnosis,  it  closely  resembles  the  actual  physiological 
model of the patient, at least with respect to the properties relevant to the patient’s
complaint. A good analogy for this gradual uncovering of the patient profile is the
game  of  Battleship,  where  the  players  gradually  determine  the  positions  of  their 
opponent’s  ships  on  a  grid. 
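
The Battleship analogy can be made concrete with a small sketch in which the chart starts empty and each ordered test copies a few hidden properties into it. The property names, their values, and the test catalogue below are invented for illustration and are not taken from the MVP knowledge base.

```python
# Hidden state of one virtual patient. Names and values are invented.
hidden_patient = {
    "lower_sphincter_pressure": "high",
    "peristalsis": "absent",
    "acid_exposure": "normal",
}

# Hypothetical test catalogue: which hidden properties each test reveals.
TESTS = {
    "manometry": ["lower_sphincter_pressure", "peristalsis"],
    "ph_monitoring": ["acid_exposure"],
}


def order_test(test_name, patient, chart):
    """Copy the properties revealed by a test from the patient into the chart."""
    for prop in TESTS[test_name]:
        chart[prop] = patient[prop]
    return chart


def fraction_uncovered(patient, chart):
    """Share of the patient's properties that the user's chart now contains."""
    return len(chart) / len(patient)
```

Starting from an empty chart, ordering both tests leaves the chart identical to the hidden profile, which corresponds to the case of a fully uncovered patient.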

At  any  point  during  the  management  of  the  patient,  the  user  may  prescribe 
treatments.  In  other  words,  the  system  allows  the  user  not  only  to  issue  queries  but  also  to 
intervene  in  the  simulation,  changing  property  values  within  the  patient.  Any  single 
change  can  induce  other  changes  -  that  is,  the  operation  of  an  agent  can  at  any  time 
activate  the  operation  of  another  agent. 
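
One way to picture how a single change induces further changes is a queue of property updates, where each update may trigger an agent that proposes the next one. The sketch below is an assumption about the general mechanism, not the MVP architecture; the rules and property names are invented and carry no clinical authority.

```python
from collections import deque

# Hypothetical agents: each watches one property and, when it changes,
# proposes a new value for another property.
RULES = {
    "acid_suppressant_dose": lambda s: (
        "acid_exposure", "low" if s["acid_suppressant_dose"] > 0 else "high"
    ),
    "acid_exposure": lambda s: ("heartburn", s["acid_exposure"] == "high"),
}


def apply_change(state, prop, value):
    """Apply one change and propagate it; return the properties that changed."""
    queue = deque([(prop, value)])
    changed = []
    while queue:
        name, new_value = queue.popleft()
        if state.get(name) == new_value:
            continue  # nothing changed, so no further agent is activated
        state[name] = new_value
        changed.append(name)
        if name in RULES:
            queue.append(RULES[name](state))
    return changed
```

Prescribing a treatment then amounts to one call to `apply_change`, with the downstream effects following from whichever agents the change activates.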

Simulation:  Utility 

The  MVP  project  can  be  viewed  as  just  one  of  a  number  of  applications  in  the 
area  of  intelligent  clinical  systems.  The  latter,  in  turn,  can  be  viewed  as  one  of  the 
possible  domains  in  which  one  can  apply  modeling  teams  of  intelligent  agents  featuring  a 
combination  of  physical  system  simulation  and  cognitive  processing.  So,  in  the  most 
general  terms,  our  work  can  be  viewed  as  devoted  to  creating  working  models  of 
societies  of  artificial  intelligent  agents  that  share  a  simulated  “world”  of  an  application 
domain  with  humans  in  order  to  jointly  perform  cognitive  tasks  that  have  until  now  been 
performed  exclusively  by  humans.  Sample  applications  of  such  models  include: 
o  a  team  of  medical  professionals  diagnosing  and  treating  a  patient  (with  humans 
playing  the  role  of  either  a  physician  or  a  patient) 

o  a  team  of  intelligence  or  business  analysts  collecting  information,  reasoning  about 
it  and  generating  analyses  or  recommendations  (with  humans  playing  the  role  of 
team leader)

o a team of engineers designing or operating a physical plant (with humans playing
the  role  of  team  leader) 

o  a  learning  environment  (where  humans  play  the  role  of  students). 

As  can  be  seen,  this  work  is  at  the  confluence  of  several  lines  of  research  -  cognitive 
modeling,  ontological  engineering,  reasoning  systems,  multi-agent  systems,  simulation 
and  natural  language  processing. 

Simulation:  Accomplishments 

In  Year  4  of  the  project,  our  team  has  delivered  two  new  versions  of  the  Maryland 
Virtual Patient Environment. The realism of the simulation has been enhanced by
covering “unexpected” interventions, allowing treatments to be discontinued, and
allowing new diseases to develop as side effects of treatments. The user interface has
been  redesigned.  A  new  agent-based  architecture  has  been  developed  to  support  enhanced 
cognitive  capabilities  of  the  virtual  patient  and  the  intelligent  tutor,  including  language 
capabilities.  In  the  area  of  language  processing,  a  dialog  processing  model  was 
developed.  Work  has  continued  on  improving  the  language  understanding  capabilities, 
centrally  including  treatment  of  referring  expressions.  Enhancement  of  static  knowledge 
resources,  the  ontology  and  the  lexicon,  has  been  ongoing.  Work  on  extending  the 
coverage  of  diseases  has  been  ongoing:  a  further  improvement  of  the  model  of  GERD  is 
under  way,  as  is  the  modeling  of  cardiovascular  diseases.  A  totally  reworked  system 
version,  with  dialog  support,  is  planned  for  release  in  June  2008.  Work  has  also  been 
ongoing  on  improving  and  extending  the  set  of  development  tools  -  the  DEKADE 
demonstration,  evaluation  and  knowledge  acquisition  environment  supporting  natural 
language  work  has  been  revamped;  the  interface  for  creating  instances  of  virtual  patients 
has  also  been  enhanced;  a  web-based  environment  for  supporting  internal  documentation 
has  been  installed.  Finally,  we  have  written,  submitted,  published  or  delivered  6 
conference  and  journal  papers. 

C.  Smart  Image 

Introduction 

The  overall  objective  of  this  “Smart  Image”  project  has  been  to  demonstrate  the 
technical  feasibility  of  live  augmented  reality  (AR),  which  is  the  fusion  of  live  volumetric 
computed  tomography  (CT)-generated  views  of  the  surgical  field  with  laparoscopic 
views.  The  advantage  of  live  AR  is  that  internal  structures  absent  in  laparoscopic  views 
can  be  visualized  using  CT.  Being  able  to  see  internal  structures,  especially  underlying 
vessels,  before  making  a  dissection  has  been  a  longstanding  need  of  minimally  invasive 
surgeons. 

Although the proposed use of continuous volumetric CT is advantageous for
creating renderings of the internal structures with their orientations refreshed at a rapid
rate (approx. 1 Hz), it is also imperative that new technologies be created to reduce the
radiation dose to the patient and the surgeon alike originating from the use of CT. Our
first  three  objectives  address  the  radiation  dose  problem  while  providing  a  means  to 
visualize  the  vasculature  throughout  the  surgery  with  a  single  administration  of  the  CT 
contrast  agent.  The  fourth  objective  is  devoted  to  creating  spatially  and  temporally 
synchronized AR views.

Objective 1: Dose reduction strategy: Registration between high dose-low dose CT

This  objective  has  allowed  us  to  track  tissue  motion  through  the  use  of  continuous 
intra-operative  volumetric  CT  scans  acquired  at  ultralow  doses.  Because  intra-operative 
ultralow  dose  CT  scans  are  not  contrast  enhanced  and  have  low  image  quality,  we 
successfully  tested  the  concept  of  first  acquiring  a  diagnostic  quality  contrast-enhanced 
CT  scan  immediately  prior  to  the  surgery  and  then  modifying  (i.e.,  warping)  it  to  spatially 
register  with  low-dose  CT.  The  net  result  is  that  high-quality  contrast-enhanced  CT 
images  of  the  surgical  field  are  available  throughout  the  surgery. 

We earlier demonstrated this capability using simulated low-dose CT images
(low-dose image simulated from a high-dose scan) [1]. In this last project year, we
retested this capability with actual high- and low-dose CT scans of a pig liver. To test the
accuracy  of  registration,  we  further  placed  5-6  tiny  markers  (2-3  mm  segments  of  a  thin 
guidewire)  in  the  liver  parenchyma.  The  results  of  registration  for  a  representative  case 
are  shown  in  Figure  1.  A  better  structural  alignment  is  seen  after  registration. 


Figure 1. Superposition of initial CT and intra-operative CT (acquired at 200 mAs)
before (left) and after (right) nonrigid registration. Note better structure alignment after
registration.

We have further investigated the accuracy of registration quantitatively using
implanted markers. The target registration error (or misalignment) before and after
registration is shown in Table 1. Overall, the registration reduced approx. 3.5 mm of
initial misalignment to an acceptable approx. 1.5 mm at all dose levels. This also
indicates  that  intra-operative  CT  can  be  conducted  at  25  mAs,  which  is  roughly  10  times 
smaller  than  the  diagnostic  dose.  An  abstract  accepted  for  presentation  at  the  June  2009 
meeting  of  Computer  Assisted  Radiology  and  Surgery  (CARS)  conference  on  this  topic  is 
included  with  this  report. 
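
As an illustration of the marker-based measure used above, target registration error can be computed as the mean distance between corresponding marker positions in the two scans. The sketch below assumes marker coordinates are already available in millimetres; the coordinates are invented, while the `TABLE_1` values are the ones reported in Table 1.

```python
import math


def target_registration_error(fixed_pts, moving_pts):
    """Mean Euclidean distance (mm) between corresponding marker positions."""
    distances = [math.dist(p, q) for p, q in zip(fixed_pts, moving_pts)]
    return sum(distances) / len(distances)


# Invented marker coordinates (mm) in the intra-operative scan, and in the
# initial scan before and after a hypothetical registration.
fixed = [(0, 0, 0), (10, 0, 0), (0, 10, 0)]
before = [(3, 0, 0), (10, 4, 0), (0, 10, 3)]
after = [(1, 0, 0), (10, 2, 0), (0, 10, 0)]

# Values from Table 1: dose (mAs) -> (initial, after registration), in mm.
TABLE_1 = {200: (3.12, 1.47), 75: (3.63, 1.67), 25: (3.25, 1.45)}


def percent_reduction(initial, final):
    """Relative reduction in misalignment, as a percentage."""
    return 100.0 * (initial - final) / initial


reductions = {dose: percent_reduction(*vals) for dose, vals in TABLE_1.items()}
```

Applied to Table 1, the relative reduction is a little over half at each dose level, which is consistent with the statement that registration performs comparably at 25 mAs and at 200 mAs.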


Table 1. Image misalignment before and after registration for helical scans.

Dose (mAs)      Initial misalignment (mm)    Misalignment after registration (mm)
200 (high)      3.12                         1.47
75 (medium)     3.63                         1.67
25 (low)        3.25                         1.45

Note that the potential to further reduce the dose exists. First, the 25 mAs setting
is the lowest setting that our current CT scanner allows; as confirmed by our earlier
simulation study, registration is expected to work even at 10 mAs, and we plan to explore
this in the future. Second, replacing the standard reconstruction technique with an iterative
technique, as explained below, offers the potential to further reduce the dose.

Objective  2:  Dose  reduction  strategy:  Iterative  reconstruction 

Iterative  reconstruction  techniques  are  known  to  give  better  quality  image 
reconstructions  in  the  presence  of  noise.  Hence  these  techniques  are  better  suited  for 
reconstruction of low-dose images. Over the past year, we implemented the iterative
Paraboloidal Surrogate (PS) algorithm and compared its reconstructions
with those from the standard filtered backprojection (FBP) algorithm that clinical CT
scanners use. As iterative techniques are generally computationally intensive, we also
accelerated the PS algorithm on a cluster of CPUs and on a graphics processing unit
(GPU).


Figure 2. Reconstruction of low-dose (25 mAs) data using the standard FBP method (left) and
iterative  PS  method  (right).  The  right  image  is  sharper  and  less  noisy. 

Using actual low-dose scanner data, we found that the PS algorithm provided improved
image quality, as seen in Figure 2. Figure 3 plots image quality, measured as
peak signal-to-noise ratio (PSNR), versus CT dose for both algorithms. Although an
absolute comparison between the two techniques is difficult using PSNR, we observed
that image quality drops precipitously as the dose is lowered with the standard FBP
algorithm, whereas image quality stays unchanged with dose for the iterative PS algorithm.

Figure 3. Image quality (in terms of PSNR) versus radiation dose (in terms of x-ray tube
current)  for  the  two  reconstruction  algorithms. 
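
For readers unfamiliar with the measure, PSNR compares a test image against a reference image through the mean squared error, PSNR = 10 log10(peak^2 / MSE). The sketch below is a generic implementation; which reference image and peak value were used for Figure 3 is not stated in this report, so both are left as parameters, and the tiny images are invented.

```python
import math


def psnr(reference, test, peak):
    """PSNR in dB between two equally sized images given as nested lists."""
    squared_errors = [
        (r - t) ** 2
        for row_r, row_t in zip(reference, test)
        for r, t in zip(row_r, row_t)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)


# Invented 2 x 2 example: every pixel of the noisy image is off by 5 grey levels.
reference_image = [[0, 255], [255, 0]]
noisy_image = [[5, 250], [250, 5]]
quality_db = psnr(reference_image, noisy_image, peak=255)
```

Because the value depends on the chosen reference, PSNR is better suited to tracking how one algorithm degrades with dose than to ranking two algorithms in absolute terms, which matches the caution expressed above.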

Reconstruction time varies with image size. For a 512 x 512 image
matrix, which is the standard size of clinical CT images, the GPU is capable of
performing  1  reconstruction  per  second.  This  speed  is  comparable  to  the  reconstruction 
speed  on  most  CT  scanners.  Two  abstracts  were  presented  on  this  work  during  the  last 
project  year.  The  IEEE  Nuclear  Science  Symposium  and  Medical  Imaging  Conference 
abstract  is  attached  with  this  report. 

Objective  3:  High-speed  implementation  of  non-rigid  registration 

Non-rigid  registration  between  intra-operative  CT  and  initial  CT  images  is  a 
fundamental  need  for  live  AR.  Computational  complexity  of  non-rigid  registration, 
however,  is  high  and  was  addressed  by  this  technical  objective. 

We earlier reported publication of the initial architectural design for accelerated
non-rigid image registration [2]. This architecture is capable of calculating mutual
information, a compute-intensive step in intensity-based image registration, approx. 40
times faster than a software-based implementation, and can reduce the execution time of
non-rigid registration from hours to minutes. We also fully tested this implementation
using  high-  and  low-dose  CT  images. 
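
To show why mutual information is the costly step, the sketch below computes it in software from the joint histogram of two images, using I(A;B) = sum over (a, b) of p(a,b) log2[p(a,b) / (p(a) p(b))]. It is a plain reference computation, not the hardware architecture of [2], and it assumes intensities have already been binned and that the images are given as flat sequences of equal length.

```python
import math
from collections import Counter


def mutual_information(image_a, image_b):
    """Mutual information in bits between two equal-length intensity sequences."""
    n = len(image_a)
    joint = Counter(zip(image_a, image_b))  # joint histogram
    hist_a = Counter(image_a)               # marginal histograms
    hist_b = Counter(image_b)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((hist_a[a] / n) * (hist_b[b] / n)))
    return mi
```

Every voxel pair contributes to the joint histogram, and the histogram must be rebuilt each time the transformation is updated during optimization, which is why accelerating this step shortens the whole registration.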

In  the  last  year,  our  efforts  were  directed  at  showing  the  optimality  of  our 
hardware implementation. A full-length paper on this work is in press and is included
with  this  report. 

Objective  4:  Tracking  and  visualization 

We  have  presented  in  detail  in  prior  reports  our  methods  to  link  CT  and 
laparoscope  coordinate  systems  using  optical  tracking,  which  is  a  prerequisite  for  creating 
AR views. Working systematically with phantoms and ex-vivo specimens, we devoted
considerable effort to identifying and removing the sources of spatial misregistration in
merging CT and laparoscopic views. After perfecting spatial calibration, we
conducted  two  additional  animal  (pig)  experiments  in  which  we  collected  the  necessary 
CT  and  laparoscopic  data  for  creating  live  AR  visualization.  All  methods  were  finalized 
and  all  new  data  have  been  processed.  Figures  4  and  5  show  new,  spatially  well 
registered  CT  and  laparoscopic  views.  Good  spatial  registration  can  also  be  verified  by 
the  co-location  of  two  markers  we  had  placed  on  the  surface  of  the  liver  in  the  two 
modalities.  An  abstract  summarizing  our  overall  live  AR  concept  and  initial  results  has 
been  accepted  for  presentation  at  the  annual  meeting  of  the  Society  of  American 
Gastrointestinal  and  Endoscopic  Surgeons  in  April  2009. 
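
Linking two coordinate systems through a tracker amounts to chaining rigid transforms. The sketch below assumes the tracker reports the pose of both the CT frame and the laparoscope as 4 x 4 homogeneous matrices; the matrix names and numeric poses are invented, and the camera calibration steps that a real system also needs are omitted.

```python
def compose(a, b):
    """Matrix product of two 4x4 homogeneous transforms (nested lists)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def invert_rigid(t):
    """Invert a rigid transform: rotation R^T and translation -R^T t."""
    rot_t = [[t[j][i] for j in range(3)] for i in range(3)]
    trans = [-sum(rot_t[i][k] * t[k][3] for k in range(3)) for i in range(3)]
    return [rot_t[i] + [trans[i]] for i in range(3)] + [[0, 0, 0, 1]]


def transform_point(t, point):
    """Apply a homogeneous transform to a 3D point."""
    v = list(point) + [1]
    return tuple(sum(t[i][k] * v[k] for k in range(4)) for i in range(3))


# Invented poses (mm): CT frame shifted 100 mm along x in tracker space;
# laparoscope rotated 90 degrees about z and positioned at (100, 50, 0).
tracker_from_ct = [[1, 0, 0, 100], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
tracker_from_scope = [[0, -1, 0, 100], [1, 0, 0, 50], [0, 0, 1, 0], [0, 0, 0, 1]]

# CT coordinates -> tracker coordinates -> laparoscope coordinates.
scope_from_ct = compose(invert_rigid(tracker_from_scope), tracker_from_ct)
```

Any error in either tracked pose propagates directly into `scope_from_ct`, which is one reason careful spatial calibration matters before the two views are merged.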


Figure  4.  Side-by-side  optical  (laparoscopic)  and  CT  views.  A  good  overall  alignment 
between  anatomic  structures  and  two  fiducial  markers  can  be  observed,  indicating  high- 
accuracy augmented reality if the two views are superimposed.

Figure 5. A snapshot of rendered CT showing internal structures overlaid on the
laparoscopic image.


References 

[1]  O.  Dandekar,  K.  Siddiqui,  V.  Walimbe,  and  R.  Shekhar,  "Image  registration  accuracy 
with  low-dose  CT:  how  low  can  we  go?,"  in  3rd  IEEE  International  Symposium  on 
Biomedical  Imaging:  Nano  to  Macro,  2006,  pp.  502-505. 

[2]  O.  Dandekar  and  R.  Shekhar,  "FPGA-accelerated  Deformable  Registration  for 
Improved  Target-delineation  During  CT-guided  Interventions,"  IEEE  Transactions  on 
Biomedical  Circuits  and  Systems,  vol.  1(2),  pp.  116-127,  2007. 

C.2.  Smart  Image:  Image  Pipeline 

During  the  last  phase  of  the  contract,  the  following  goals  have  guided  the  work  of 
our  group: 

❖  Continued  development  and  evaluation  of  registration  techniques  for  integrating 
the  view  from  the  laparoscope  (visual  texture)  onto  a  3D  surface  model  of  target 
organs (global 3D shape) without pose tracking.

❖  Exploration  of  techniques  for  integrating  full  volumetric  models,  surface  models, 
and  surface  texture. 

❖ Initiation of work on interpolation of missing or obscured image parts by use of
statistical models, and on the incorporation of deformation models to pre- or
intra-operative volumetric data to allow better registration of intra-operative video
sequences to these 3D data sources.

❖ Development and evaluation of a visualization test environment - the Dual
Display Environment - to allow us to determine user preferences and to document
usability enhancements likely to result from our “smart image” advances.

Texturing  3D  Surface  Models  Without  Pose  Tracking 

This  research  is  driven  by  the  need  to  increase  the  surgeon's  field  of  view  and  to 
add  depth  cues  to  laparoscopic  images.  Our  approach  is  to  find  software  solutions  that 
utilize  data  from  the  scope  itself  rather  than  by  adding  additional  cameras  or  other  sensors 
into the surgical environment. This assumes that 3D models will eventually be available
from scope data, and we have in fact explored some possible ways of addressing this
challenge (e.g., Light Falloff Stereo; see previous annual report). Here, however, we have
focused on determining how to match a series of views from the laparoscope to a 3D
surface  model  of  target  anatomy,  specifically  using  feature  tracking  to  stitch  together 
sequential,  overlapping  images.  Although  this  task  can  be  accomplished  through  the 
tracking  of  camera  parameters,  we  have  discussed  a  number  of  limitations  with  this 
approach,  including  accuracy  concerns  associated  with  the  offset  between  the  scope's  end 
and  the  motion  tracking  sensors.  However,  in  the  long  run,  we  anticipate  that  our  method 
might  be  combined  with  other  tracking  schemes  to  enhance  reliability  of  the  texturing 
process  over  that  achieved  by  any  single  method  in  isolation. 
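
The idea of stitching overlapping frames from tracked features alone can be illustrated with a deliberately simplified model. The sketch below assumes consecutive frames differ only by a 2D translation and estimates it as the median offset of matched features, which tolerates an occasional bad match. Real laparoscopic motion needs a richer model, and this is a stand-in for illustration rather than our incremental optimization method; the feature coordinates are invented.

```python
from statistics import median


def estimate_shift(matches):
    """Estimate the (dx, dy) shift from frame 1 to frame 2.

    matches: list of ((x1, y1), (x2, y2)) positions of the same feature
    in two overlapping frames.
    """
    dx = median(x2 - x1 for (x1, _), (x2, _) in matches)
    dy = median(y2 - y1 for (_, y1), (_, y2) in matches)
    return dx, dy


def to_first_frame(point, shift):
    """Map a point seen in frame 2 back into frame 1 coordinates."""
    return point[0] - shift[0], point[1] - shift[1]


# Invented matches; the last one is a deliberate mismatch (outlier).
matches = [
    ((10, 10), (4, 12)),
    ((20, 15), (14, 17)),
    ((30, 5), (24, 7)),
    ((50, 25), (44, 27)),
    ((40, 40), (90, 90)),
]
shift = estimate_shift(matches)
```

Chaining such frame-to-frame estimates accumulates drift, which is one motivation for anchoring the images to a 3D surface model and for eventually combining feature tracking with other tracking schemes.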

We  presented  a  poster  at  the  ACM  SIGGRAPH  International  Symposium  for 
Interactive  3D  graphics  (I3D),  shown  below,  that  detailed  our  procedures.  This  poster 
illustrated the application of our technique to faces, teeth, and a liver. The technique is
presently semi-automated, requiring that surgeons match the first set of 2D and 3D features.

Introduction

We present a system for mapping video images onto their corresponding
3D model. Instead of tracking the camera or solving the difficult 3D
to 2D registration problem, we pose it as an incremental optimization
problem that only requires 2D-to-2D matching.

Framework (diagram)

Algorithm description

Initial Mapping: The 3D model is parameterized over the image
using an optimization-based method while maintaining user-defined
feature correspondences between them.

Feature Extraction and Matching: Distinctive feature points are
extracted from the set of images and tracked over time using a
state-of-the-art tracking algorithm such as KLT or SIFT.

Image Registration: The multiple images taken from different
viewpoints are stitched together by using tracked features.

Video Sequence Mapping: This is the heart of the algorithm. The
images from the video sequence are mapped onto the geometry
incrementally.

Results of texture mapping: (a) Pig liver, (b) Human face, (c) Human teeth, (d) Bookend.


In  order  to  better  evaluate  our  technique  against  the  traditional  pose  tracking 
method,  we  collected  video  segments  from  real  anatomical  training  models  during  the 
first  quarter  of  the  year.  However,  we  were  faced  with  technical  challenges  when 
developing  the  baseline  against  which  to  compare  our  new  technique.  Specifically,  the 
Faro  Arm  used  for  tracking  camera  pose  was  either  broken  or  off  site  for  extended 
periods,  and  we  also  had  to  develop  a  method  of  rigidly  mounting  our  Stryker  1088 
endoscope  to  the  tracking  arm.  We  resolved  those  issues  during  the  second  quarter  and 
conducted  our  evaluations,  finding  that  our  method  resulted  in  less  registration  error  than 
the traditional method. These data were written up in a paper submitted to IEEE
Transactions on Medical Imaging, which we are currently revising after the first round of
largely positive reviews. The paper is appended, and the following are two views of our
texture-mapping results applied to a training model of an intestine. The 3D model is
presented on the right, and the textured result is provided on the left.

Extending Methods to Deformable Objects

Our  current  method  assumes  that  a  rigid  3D  model  is  known  a  priori  (e.g.,  from 
pre-op  imaging).  Our  plan  is  to  extend  this  method  to  handle  deformable  objects  without 
requiring a 3D model as input. The basic idea is to use machine
learning methods to find a mapping between the 2D appearance of an object and its
3D structure. We will first collect a training data set of a deformable model with known 2D
to  3D  correspondences.  This  can  be  obtained  via  simulation  data.  Then  non-linear 
dimension  reduction  techniques  can  be  applied  to  automatically  find  the  mapping.  After 
the  training  is  done,  we  will  evaluate  its  effectiveness  with  real  data.  Ultimately  the  goal 
is  to  create  complete  3D  models  from  a  monocular  video  sequence. 

Based on this work, a paper was submitted to ISBI 2009 (under review). It
focuses  on  using  a  geometric  deformation  model  to  guide  registration  of  a  3D  model  from 
pre-operative  images  to  a  small  set  of  reconstructed  3D  surface  landmarks  from  a 
laparoscopic  video  sequence.  Intra-operative  tissue  deformations  of  the  target  object  are 
used  to  estimate  deformations  of  important  substructures,  such  as  a  tumor  or  a  group  of 
vessels,  of  the  target  object.  Our  method  only  requires  a  small  number  of  reconstructed 
surface  landmarks  on  the  object,  and  it  does  not  require  modeling  material  properties. 
Evaluation of the registration and deformation estimation results shows that our
method performs well on the type of soft tissue deformations found in our driving
application, laparoscopic renal cryoablation.

The  following  are  examples  of  deformation  models,  including  bending  (first  two 
rows)  and  twisting  (last  row): 
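
A twisting deformation of the kind named above can be sketched geometrically. The report does not give the parameterisation of its deformation models, so the following is a generic twist (each point is rotated about the z-axis by an angle proportional to its height) offered only as an illustration; `twist_about_z` and its `rate` parameter are hypothetical names.

```python
import math


def twist_about_z(points, rate):
    """Rotate each (x, y, z) point about the z-axis by the angle rate * z."""
    twisted = []
    for x, y, z in points:
        angle = rate * z
        c, s = math.cos(angle), math.sin(angle)
        twisted.append((x * c - y * s, x * s + y * c, z))
    return twisted


# Three points stacked along z; the twist rate is 45 degrees per unit height.
column = [(1.0, 0.0, 0.0), (1.0, 0.0, 1.0), (1.0, 0.0, 2.0)]
twisted_column = twist_about_z(column, rate=math.pi / 4)
```

The point at z = 0 is unchanged, while the point at z = 2 is rotated by 90 degrees, so a straight column of surface landmarks becomes a helix-like curve.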
Visualization Methods and Cognitive Ergonomics

At  the  same  time  that  we  are  developing  better  ways  to  create  3d  models  and 
integrate  them  with  scope-acquired  images,  we  are  also  trying  to  determine  the  most 
effective  ways  to  view  the  information.  We  are  giving  special  emphasis  to  integrating 
local  information  (scope  view  texture)  and  global  information  (3d  panoramas)  to 
optimize  performance  in  a  variety  of  surgical  navigation  tasks  (e.g.,  navigation  of  the 
scope  itself,  navigation  of  instruments,  searching  for  target  “terrain”).  To  this  end,  we 
have  proposed  a  dual  display  environment,  in  which  the  scope  view  and  the  global  view 
are rendered side by side or, in some cases, with one superimposed on the other. To
test the effectiveness of various configurations, we began by gathering feedback on
early prototypes, first during a team visit to UMMC in the first quarter of the year,
where we obtained input from one fellow, two residents, and four surgeons/attending
surgeons. We also introduced an early prototype of the visualization
environment  at  the  2008  SAGES  conference  where  surgeons  viewing  our  demonstration 
provided  additional  feedback. 

In  the  final  phase  of  this  project,  we  began  developing  a  usability  testing 
simulation  for  the  visualization  techniques.  Our  goal  was  to  have  a  simulation  that 
supports  appropriate  test  tasks  that  can  be  used  in  future  usability  testing  for  a  variety  of 
variations  on  the  basic  dual  display  visualization  scheme.  The  target  objects  in  the 
simulation  can  vary  from  objects  familiar  to  nonspecialists  (e.g.,  human  faces)  to  livers, 
kidneys,  and  other  anatomical  structures.  A  screen  shot  from  the  simulation  of  a  human 
face is presented below. The left side shows the global, perspective view; the right side
shows a simulation of what would be seen through the “scope.” The small cone in the
global  view  shows  the  location  of  the  scope.  A  series  of  screen  shots  is  also  shown  using 
a  model  of  a  kidney.  These  screen  shots  show  other  interactions  that  the  surgeon  can 
have  with  the  simulation,  including  manipulating  the  global  view,  and  varying  the 
transparency  of  surface  structures. 
Experimental tasks are based on a variety of surgical navigation tasks that are
likely  to  be  aided  by  providing  the  surgeon  with  a  global  view  in  addition  to  detailed 
views.  Some  tasks  are  relatively  simple,  for  example,  finding  a  target  symbol  hidden 
amongst  other  structures.  In  this  circumstance,  the  global  view  will  provide  a  way  for 
participants  to  keep  track  of  where  they  have  already  searched.  Other  tasks  involve 
finding  the  shortest  path  between  landmarks  and  retracing  one’s  path.  The  relative 
location  of  our  component  displays  (global  vs.  local),  color  coding,  and  method  of 
integration  can  all  be  tested  to  determine  differences  in  search  time,  search  efficiency, 
and perception of mental workload. The simulation uses a Phantom Desktop haptic
display device to allow a research participant to manipulate the simulated “scope.” The
system records search times, search efficiency metrics, and search similarity metrics.
The proposed advantages of the dual display framework were presented at the 2008 ISE
conference, and another paper is currently under review.
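
One way a search efficiency metric could be computed from a recorded scope trajectory is sketched below. The report does not define its metrics, so the definition used here (direct start-to-target distance divided by distance actually travelled) is an assumption, and `path_length`, `search_efficiency`, and the sample trajectory are hypothetical.

```python
import math


def path_length(path):
    """Total distance travelled along a sequence of 3D positions."""
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


def search_efficiency(path):
    """Direct start-to-end distance divided by distance travelled.

    A value of 1.0 means the participant moved straight to the target;
    smaller values indicate a more roundabout search.
    """
    travelled = path_length(path)
    if travelled == 0:
        return 1.0
    return math.dist(path[0], path[-1]) / travelled


# Hypothetical scope-tip positions sampled during one trial.
scope_path = [(0, 0, 0), (3, 0, 0), (3, 4, 0)]
efficiency = search_efficiency(scope_path)
```

In this example the participant travels 7 units to reach a target 5 units away, giving an efficiency of 5/7.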

Papers  and  Presentations 

Carswell, Melody, Han, Qiong, and Lio, Cindy (2008). Cognitive Ergonomics and
Display Development for Minimally Invasive Surgery. Poster presented at SAGES 2008:
Society of American Gastrointestinal and Endoscopic Surgeons. April 2008, Philadelphia, PA.

Carswell,  C.M.  (2008).  Surgical  Navaids.  Presented  at  Innovations  in  the  Surgical 
Environment  Conference,  Baltimore,  MD  June. 

Han, Q., Carswell, M., Seales, W., and Strap, S. (in submission to ISBI 09).
Model-based estimation and visualization of global object deformations for
laparoscopic renal cryoablation.

Yang,  Ruigang,  Carswell,  Melody,  Wang,  Xianwang,  Zhang,  Qing,  Han,  Qiong,  Seales, 
Brent,  and  Lio,  Cindy,  (in  submission).  A  dual  display  framework  for  laparoscopy: 
Mapping  the  way.  Surgical  Innovations. 

Wang, Xianwang, Zhang, Qing, Yang, Ruigang, Seales, Brent, and Carswell, Melody
(2008). Feature-based texture mapping from video sequence. Presented at the 2008
Symposium on Interactive 3D Graphics and Games, ACM SIGGRAPH.

Wang,  Xianwang,  Zhang,  Qing,  Han,  Qiong,  Yang,  Ruigang,  Carswell,  M.,  and  Seales, 
B.  (in  submission).  Endoscopic  video  texture  mapping  on  pre-built  3D  anatomical 
objects  without  camera  tracking.  In  revision  for  IEEE  Transactions  on  Medical  Imaging. 


Key  Research  Accomplishments 

A.  Informatics 

Informatics  subgroup  1.  Perioperative  Scheduling  Study 

•  A  mathematical  model  was  developed  to  evaluate  congestion  in  the  perioperative  units. 
This  model  now  drives  medical  center  planning  and  patient  movement  activity. 

Informatics subgroup 2. Operating Room Glitch Analysis
• A user interface (dashboard) was developed, containing a data manipulation layer that
allows  interaction  with  the  data  warehouse  and  provides  analysts  with  means  to  create 
and  track  new  metrics,  a  visualization  toolkit  to  create  graphs  and  web  pages  to  display 
information  effectively,  and  semantic  services  which  generate  data  drilling  functionality 
for  graphs  and  data  tables  without  the  need  for  extra  programming  or  user  configuration. 
This  resulted  in  a  monitoring  system  of  perioperative  activity  that  is  used  daily  in  the 
medical  center. 

Informatics  subgroup  3.  Video  Summarization 

Work on the CAST system led to an effort to develop a summarization system for
video events in the operating room. The research team compared image features using
a distance metric to identify the critical view of a laparoscopic cholecystectomy.
Initial results were promising, but more work is needed to increase accuracy. We are
currently experimenting with particle analysis, edge analysis, and support vector
machines as ways to create a more robust image classifier.
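
The idea of comparing image features with a distance metric can be sketched as follows. The report does not say which features or metric were used; this illustration assumes a coarse grey-level histogram as the feature and Euclidean distance as the metric, and the names `intensity_histogram` and `closest_frame` as well as the tiny pixel lists are hypothetical.

```python
import math


def intensity_histogram(pixels, bins=4, max_value=256):
    """Normalised histogram of grey levels, used as a simple image feature."""
    if not pixels:
        raise ValueError("frame has no pixels")
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // max_value, bins - 1)] += 1
    return [c / len(pixels) for c in counts]


def closest_frame(frames, reference):
    """Return (index, distance) of the frame whose feature is nearest the reference."""
    ref_feature = intensity_histogram(reference)
    distances = [math.dist(intensity_histogram(f), ref_feature) for f in frames]
    best = min(range(len(frames)), key=distances.__getitem__)
    return best, distances[best]


# Hypothetical data: a reference "critical view" and three candidate frames,
# each reduced to four grey-level pixels for brevity.
reference_view = [200, 210, 220, 30]
frames = [[10, 20, 30, 40], [195, 205, 250, 20], [100, 120, 130, 140]]
best_index, best_distance = closest_frame(frames, reference_view)
```

A histogram ignores spatial layout, which is one reason a simple distance-based approach may lack accuracy and why richer features and a trained classifier are natural next steps.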

B. Simulation (Virtual Patient)

Over  the  course  of  this  project,  the  research  team  has  delivered  two  new  versions 
of the Maryland Virtual Patient Environment. These versions add coverage of “unexpected”
interventions, discontinued treatments, and new diseases that develop as side effects of
treatments.  The  user  interface  has  been  redesigned  and  a  new  agent-based  architecture 
has  been  developed  to  support  enhanced  cognitive  capabilities  of  the  virtual  patient  and 
the  intelligent  tutor,  including  language  capabilities.  In  the  area  of  language  processing, 
a  dialog  processing  model  was  developed.  A  totally  reworked  system  was  completed  in 
2008  with  significant  enhancements  in  coverage  of  diseases. 

C.1. Smart Image: CT guided imaging

The highlights of this research effort include the demonstration of the feasibility
of deformable image registration with low-dose CT, the demonstration of the potential
for up to a 20-fold reduction in radiation dose, and the development of an iterative
reconstruction algorithm for low-dose CT images. This implementation offers
improved image quality at low dose when compared with scanner-based reconstruction.
Additionally, this project saw the development of an FPGA-based architecture for
accelerated implementation of the deformable registration algorithm. This architecture
is capable of providing a 40-fold speedup for image registration. The team achieved the
capability of performing rigid registration (the first step in deformable image
registration) in under 1 minute.

C.2.  Smart  Image:  Image  Pipeline 

The  highlights  of  this  research  program  included  the  development  of  a  new 
method  of  intra-operative  registration  that  relies  on  feature-based  texture  mapping  to 
spatially  integrate  video  sequences  into  panoramas,  the  publishing  of  initial  technical 
evaluations  on  the  new  registration  technique,  the  development  of  “baseline”  registration 
samples  using  more  traditional  (camera  tracking)  procedures  against  which  to  compare 
our  new  technique  and  the  development  of  a  new  method  of  intra-operative  depth 
acquisition  dubbed  “light  fall-off  stereo.” 
The team initiated collaboration with Stryker Endoscopy to integrate light fall-off
stereo  into  a  prototype  endoscope  and  developed  design  concepts  for  visualization 
techniques  based  on  principles  of  cognitive  ergonomics. 
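
The principle behind depth acquisition from light fall-off can be illustrated with a short worked example. The report does not describe the formulation, so this sketch rests on stated assumptions: a point light source whose observed intensity falls off with the inverse square of distance, and two images of the same surface point taken with the light at distance r and at r + delta. Under those assumptions I_near / I_far = ((r + delta) / r)^2, which gives r = delta / (sqrt(I_near / I_far) - 1). The function name `depth_from_falloff` and all numeric values are hypothetical.

```python
import math


def depth_from_falloff(i_near, i_far, delta):
    """Estimate distance to a surface point from two intensity measurements.

    i_near: intensity with the light at the unknown distance r
    i_far:  intensity with the light moved back by delta (distance r + delta)
    Assumes inverse-square fall-off and no change other than light position.
    """
    if i_far <= 0 or i_near <= i_far:
        raise ValueError("expected i_near > i_far > 0")
    return delta / (math.sqrt(i_near / i_far) - 1.0)


# Synthetic check: generate intensities for a known depth and recover it.
true_depth = 50.0   # arbitrary units, hypothetical
delta = 10.0
i_near = 1.0 / true_depth ** 2
i_far = 1.0 / (true_depth + delta) ** 2
estimated_depth = depth_from_falloff(i_near, i_far, delta)
```

Because only the ratio of the two intensities is used, unknown surface albedo cancels out under these assumptions, which is part of what makes the approach attractive for a monocular scope.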


Reportable  Outcomes 

Informatics  subgroup  1.  WORQ 

We  have  now  completed  a  mathematical  model  for  evaluating  congestion  in  the 
postoperative  units,  including  ICUs,  IMCs  and  floor  units.  This  model  is  currently  in  use 
within  the  medical  center  to  predict  patient  flow. 
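
The report does not describe the form of the congestion model. As a rough illustration of how congestion in a unit with a fixed number of beds can be quantified, the sketch below uses the classical Erlang B loss formula, a standard queueing result swapped in here purely as an example and not the project's model. The function name `erlang_b` and the bed count, arrival rate, and length of stay are hypothetical.

```python
def erlang_b(beds, offered_load):
    """Probability that an arriving patient finds every bed occupied.

    offered_load is the arrival rate multiplied by the mean length of stay.
    Uses the numerically stable recursion
    B(k) = a * B(k-1) / (k + a * B(k-1)), with B(0) = 1.
    """
    blocking = 1.0
    for k in range(1, beds + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


# Hypothetical unit: 10 beds, 2 arrivals per day, mean stay of 4 days.
p_blocked = erlang_b(beds=10, offered_load=2.0 * 4.0)
```

A model of this kind lets planners ask how the chance of a blocked admission changes as beds are added or as length of stay shortens.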

Informatics  subgroup  2.  OGA 

The  clinical  information  system  developed  by  this  project  is  in  regular  use  by  the 
hospital  to  monitor  operating  room  delay  and  consequent  patient  flow.  A  live  dashboard 
portraying  OR  timing  is  on  the  desk  of  the  Heads  of  Surgery,  Radiology  and 
Anesthesiology.

Simulation 

The  Maryland  Virtual  Patient  has  been  developed  into  the  only  known  cognitive 
simulation  training  system.  Specific  clinical  aspects  of  LERD/GERD  can  be  taught  with 
realism  and  accuracy. 

Programmatic 

Overall,  the  research  program,  Innovations  in  the  Surgical  Environment,  has 
produced  a  significant  growth  in  the  scientific  knowledge  pertaining  to  the  surgery  milieu. 
The annual research conference has matured from a local, PI-update type of meeting to
an important and well-respected international scientific event. Attendees at this meeting
include  representatives  of  the  highest  national  health  agencies,  the  National  Institutes  of 
Health,  major  academic  settings  and  military  organizations.  More  than  two  hundred 
publications  and  presentations  have  been  generated  directly  from  research  conducted 
under  the  auspices  of  this  contract. 

Conclusion 

This report began with the recognition that an extraordinary evolution in surgical
care has occurred, driven by rapid advances in technology and creative approaches to
medicine. The increased speed and power of computer applications, the rise of
visualization technologies related to imaging and image guidance, and improvements in
simulation-based technologies (tissue properties, tool-tissue interaction, graphics,
haptics, etc.) have interacted to advance the practice of surgery. However, the medical
profession lags behind other fields in applying information systems. The research
program reported here has proceeded under the mantle of “Operating Room of the Future.”
As a natural outcome of lessons learned, we replaced that theme with the more
appropriate “Innovations in the Surgical Environment.”
There are three major portions of this study: OR informatics, simulation research,
and  smart  image.  Future  research  efforts  here  at  Maryland  will  incorporate  work  related 
to  cognitive  ergonomics  and  human  factors  as  these  impact  the  surgical  environment. 

The  purpose  of  the  OR  informatics  program  is  to  develop,  test,  and  deploy 
technologies  to  collect  real-time  data  about  key  tasks  and  process  elements  in  clinical 
operating  rooms.  We  have  established  testbeds  of  activities  in  both  simulated  and 
operational  environments.  We  are  currently  performing  tests  of  the  hardware,  refining 
software,  and  applying  lessons  learned  to  hospital  operational  functions.  The  objective  of 
Simulation  research  is  to  create  a  system  where  a  user  can  interact  with  a  virtual  human 
model  in  cognitive  simulation  and  have  the  virtual  human  respond  appropriately  to  user 
queries  and  interventions  in  clinical  situations,  with  a  focus  on  cognitive  decision  making 
and judgment. We have made significant strides toward realizing these goals. The MVP
simulation functions well for esophageal disorders, and we are continuing to expand the
repertoire of diseases in the simulation model.

The objective of smart image was to use real-time 3D ultrasonography and 40-slice
high-frame-rate computed tomography (CT) for intraoperative imaging, presenting
volume-rendered anatomy from the perspective of the endoscope. We are combining CT and
ultrasound to overlay images and data to enhance the performance of surgeons-in-training.
We  have  carried  out  animate  model  testing  of  the  image  registration  with  great 
success.  We  continue  to  refine  and  expand  our  capability  through  hardware  and  software 
refinement. 

In  the  future,  OR  workspace  layout  would  be  optimized  through  ergonomic  data 
and  human  factors  analysis,  and  this  optimization  would  lead  to  the  establishment  of 
“best  practices”  for  an  array  of  surgical  operations.  Proper  layout  would  reduce  risks  of 
infection,  speed  operations,  and  reduce  fatigue  of  surgeons  and  staff,  all  elements  that 
could contribute to a reduction in adverse events (AEs) and improved patient safety.

The  year  ahead  is  full  of  promise  for  refinements  in  the  use  of  informatics  to 
support  safe  and  efficient  operating  room  procedures,  the  use  of  simulation  to  improve 
and  accelerate  the  training  of  competent  surgeons,  and  the  blending  of  imaging 
capabilities  to  provide  clearer  and  safer  interactions  between  patient  and  surgeon. 


References 

Year  1  (2004) 

■  Shekhar  R,  Zagrodsky  V,  Garcia  M,  Thomas  JD:  Registration  of  real-time  3D 
ultrasound  images  of  the  heart  for  novel  3D  stress  echocardiography.  IEEE 
Transactions  on  Medical  Imaging  23(9),  p  1141-1149,  2004. 

■  Tang  M,  Shekhar  R,  Huang  D:  Mean  curvature  mapping  for  detection  of  corneal 
shape  abnormality.  IEEE  Transactions  on  Medical  Imaging  24(3),  p  424-428, 
2005. 

■ Castro-Pareja CR, Shekhar R: Hardware acceleration of mutual information-based
3D image registration. Journal of Imaging Science and Technology 49(2), p 105-113,
2005.


DAMD-1 7-03-2-0001 
Page  32 


■ Li J, Papachristou C, Shekhar R: An FPGA-based computing platform for real-time
3D medical imaging and its application to cone-beam CT reconstruction.
Journal of Imaging Science and Technology 49(3), p 237-245, 2005.

■  Zagrodsky  V,  Walimbe  V,  Castro-Pareja  CR,  Qin  JX,  Song  J-M,  Shekhar  R: 
Registration-assisted  segmentation  of  real-time  3-D  echocardiographic  data  using 
deformable  models.  IEEE  Transactions  on  Medical  Imaging  24(9),  1089-1099, 
2005. 

■ Shekhar R, Walimbe V, Raja S, Zagrodsky V, Kanvinde M, Wu G, Bybel B:
Automated three-dimensional elastic registration of whole-body PET and CT
from separate or combined scanners. Journal of Nuclear Medicine 46(9),
1488-1496, 2005.

■  Tang  M,  Shekhar  R,  Miranda  D,  Huang  D:  Characteristics  of  keratoconus  and 
pellucid  marginal  degeneration  in  mean  curvature  maps.  American  Journal  of 
Ophthalmology  140(6),  993-1001,  2005. 

■ Xiao Y, Seagull JF, Mackenzie CF, Klein K. Adaptive Leadership in Trauma
Resuscitation Teams: A Grounded Theory Approach to Video Analysis.
Cognition Technology & Work, 12:158-164, 2004.

■ Anbari KK, Garino JP, Mackenzie CF. Hemoglobin substitutes. Eur Spine J.
13: Suppl 1: S76-82.

■ Xiao Y, Mackenzie CF. Introduction to the special issue on video-based research
in high risk settings: Methodology and experiences. Editorial. Cognition
Technology & Work. 12:130, 2004.

■ Hu PF, Xiao Y, Mackenzie CF, Seagull FJ, Brooks T, LaMonte MP, & Gagliano
D. Many to One to Many Telemedicine Architecture and Applications.
Telemedicine Journal and e-Health. 10 (Supplement 1), S-39. 2004.

■ Xiao Y, Gagliano D, Hu PF, LaMonte MP, & Mackenzie CF. Designing mobile
telemedicine applications: A testbed and lessons learned. Telemedicine Journal
and e-Health. 10 (Supplement 1), S-119. 2004.

■ Mackenzie CF, Xiao Y, Horst RL. Video Task Analysis in High Performance
Teams. Cognition Technology & Work, 12:139-147, 2004.

■ J. Caban and W. B. Seales, "Reconstruction and Enhancement in Monocular
Laparoscopic Imagery," in Proceedings of Medicine Meets Virtual Reality
(MMVR) 2004, Newport Beach, California, January, 2004.

■ W. B. Seales and Duncan Clarke, "Computing Support for Information-Rich
Laparoscopy," in Surgical Innovation, Volume 12, Number 4 (December), 2004.

Year  2  (2005) 

■  Segan  R,  Park  A:  Training  Competent  Minimal  Access  Surgeons:  Review  of 
Tools,  Metrics,  and  Techniques  Across  the  Spectrum  of  Technology.  Surgical 
Technology  International  XIII.  Universal  Medical  Press.  2005;  pp  25-32. 

■  Jarrell  BE,  Mallott,  DE,  Raczek,  J,  Skinner,  C,  Jarrell,  K,  Shimko,  M: 
Observations  for  Electronic  Medical  Simulation:  The  Heuristic  Patient,  Surgical 
Innovation,  Volume  12,  No  1,  43-49,  March  2005 

■ Fullum TM, Kim S, Dan D, Turner PL. Laparoscopic "Dome-down"
cholecystectomy with the LCS-5 Harmonic scalpel. JSLS. 2005 Mar; 9(1):51-7.


■ Siegel EL, Siddiqui KM, Johnson JP, Reiner BI, Musk A, Nagy PG, Safdar N.
Compression of multislice CT: 2D vs 3D JPEG2000 and effects of slice thickness.
SPIE: PACS and Imaging Informatics 2005.

■ Nagy PG, Siegel EL, Reiner BI, Siddiqui KM, Dal Molin J. Openrad: an open
collaboratory for radiology informaticians. SPIE: PACS and Imaging Informatics
2005.

■  Siegel  EL,  Musk  A,  Siddiqui  KM,  Reiner  BI,  Nagy  PG,  Safdar  N.  Impact  of  the 
use  of  dual  energy  subtraction  on  the  performance  of  a  computer-assisted 
diagnosis  system  in  the  detection  Proc.  SPIE  :PACS  and  Imaging  Informatics 
2005 

■ Nagy P, Schultz T. Chapter 16: Storage and Archiving. In: PACS: A Guide to the
Digital Revolution, 2nd Edition, New York, New York: Springer, 2005.

■ Nagy P, George I, Bernstein W, Caban J, Klein R, Mezrich R, and Park A. Radio
frequency identification systems technology in the surgical setting. Surg Innov
13(1): 61-67, 2006.

■ Dutton R, Ho D, Hu P, Mackenzie CF, Xiao Y. Decision Making by Operating
Room Managers: The Burden of Changes. Anesthesiology, 103:A1175. 2005.

■ Mackenzie CF (Editorial). Sacred Cows: Milking the controversies in
resuscitation. Hosp. Med (London), 66:68-69, 2005.

■  Dutton  R,  Hu  PF,  Mackenzie  CF,  Seebode  S,  Xiao  Y.  A  Continuous  Video 
Buffering  System  for  Recording  Unscheduled  Medical  Procedures. 
Anesthesiology,  103:A1241.  2005. 

■ Hu PF, Burlbaugh M, Xiao Y, Mackenzie CF, Voigt R, Brooks T, Fraser L,
Connolly MR, Herring T. Video Infrastructure and Application Design Methods
for an OR of the Future. Telemedicine and e-Health. 11(2), 211, T3C2. 2005.

■  Hu  PF,  Hu  H,  Seagull  JF,  Mackenzie  CF,  Voigt  R,  Martz  D,  Dutton  R,  Xiao  Y. 
Distributed  Video  Board:  Advanced  Telecommunication  System  for  Operation 
Room  Coordination.  Telemedicine  and  e-Health.  11(2),  248,  P28.  2005. 

■ Hu PF, Mackenzie CF, Xiao Y, Seagull JF, Lam D, Gagliano D. Mobile Video
Transfer System for Homeland Security and Disaster Management. Telemedicine
and e-Health. 11(2), 248, P29. 2005.

■  Kavic  SM,  Segan  RD,  Turner  PL,  George  IM,  Park  AE.  Repair  of  a  complex 
foregut  hernia  aided  by  three-dimensional  surgical  reconstruction  (abstract).  Surg 
Endosc  Supplement  19  2005 

■ Segan RD, Kavic SM, George IM, Turner PL, Park AE. A novel conceptual
model of the current classification of paraesophageal hernias using dynamic
three-dimensional reconstruction (abstract). Surg Endosc (Supplement 19). 2005.

■  Jesus  J.  Caban,  W.  Brent  Seales,  Adrian  Park,  Heterogeneous  Displays  for 
Surgery  and  Surgical  Simulation.  Proceedings  of  Medicine  Meets  Virtual  Reality 
13,  January  2005. 

■  Carswell,  C,  Clarke,  D.,  Seales,  W.,  “Assessing  Mental  Workload  during 
Laparoscopic  Surgery” 

■  Seales,  W.  and  Clarke,  D,  “Computing  Support  for  Information-Rich 
Laparoscopy” 

■  Lee  G.  et  al,  “Pilot  study  —  Correlation  between  postural  stability  and 
performance  time  during  fundamentals  of  laparoscopic  surgery” 


■ Lee G. et al: “Joint kinematics vary with performance skills during laparoscopic
exercise”

■  Lee  G.  et  al:  “Postural  instability  does  not  necessarily  correlate  to  poor 
performance” 

■  Li  Ding,  Tim  Finin,  Anupam  Joshi,  Yun  Peng,  Rong  Pan  and  Pavan  Reddivari, 
Search  on  the  Semantic  Web,  IEEE  Computer,  October  2005. 

■  Anand  Patwardhan,  Filip  Perich,  Anupam  Joshi,  Tim  Finin  and  Yelena  Yesha, 
Active  Collaborations  for  Trustworthy  Data  Management  in  Ad  Hoc  Networks, 
2nd  IEEE  Int.  Conf.  on  Mobile  Ad-Hoc  and  Sensor  Systems,  7-10  November 
2005,  Washington  DC. 

■  Mark  Burstein,  Christoph  Bussler,  Tim  Finin,  Michael  Huhns,  Massimo  Paolucci, 
Amit  Sheth,  Stuart  Williams  and  Michal  Zaremba,  A  Semantic  Web  Services 
Architecture,  IEEE  Internet  Computing,  pp  52-61,  v9,  n5,  September/October 
2005. 

■  Li  Ding,  Rong  Pan,  Tim  Finin,  Anupam  Joshi,  Yun  Peng  and  Pranam  Kolari, 
Finding  and  Ranking  Knowledge  on  the  Semantic  Web,  Proc.  4th  Int.  Semantic 
Web  Conf.,  Galway  IE,  Nov.  2005. 

■  Akshay  Java,  Tim  Finin  and  Sergei  Nirenburg,  Integrating  Language 
Understanding  Agents  Into  the  Semantic  Web,  First  Int.  Symposium  on  Agents 
and  the  Semantic  Web,  2005  AAAI  Fall  Symposium  Series,  Arlington  VA,  4-6 
November,  2005. 

■  Harry  Chen,  Filip  Perich,  Tim  Finin  and  Anupam  Joshi,  The  SOUPA  Ontology 
for  Pervasive  Computing,  in  Ontologies  for  Agents:  Theory  and  Experiences, 
Valentina  Tamma,  Stephen  Cranefield,  Tim  Finin  and  Steven  Willmott  (Eds), 
Springer  Verlag,  June  2005. 

■  Valentina  Tamma,  Stephen  Cranefield,  Tim  Finin  and  Steven  Willmott  (Eds), 
Ontologies  for  Agents:  Theory  and  Experiences,  Springer  Verlag,  June  2005, 
ISBN:  3-7643-7237-0. 

■  Tim  Finin,  Li  Ding,  Rong  Pan,  Anupam  Joshi,  Pranam  Kolari,  Akshay  Java  and 
Yun  Peng,  "Swoogle:  Searching  for  knowledge  on  the  Semantic  Web",  Intelligent 
Systems  Demonstration,  Proc.  20th  National  Conf.  on  Artificial  Intelligence,  July 
2005. 

■ Filip Perich, Anupam Joshi, Yelena Yesha and Tim Finin, "Collaborative Joins in
a Pervasive Computing Environment", VLDB Journal, v14 n2, pp 182-196, April
2005.

■ Li Ding, Tim Finin and Anupam Joshi, Analyzing Social Networks on the
Semantic Web, IEEE Intelligent Systems (Trends and Controversies), v9, n1,
Jan/Feb 2005.

Year  3  (2006) 

■  Agarwal,  S.,  Joshi,  A.,  Finin,  T.  and  Yesha,  Y.  A  Pervasive  Computing  System 
for  the  Operating  Room  of  the  Future,  technical  report  TR-CS-06-xx,  Computer 
Science  and  Electrical  Engineering,  University  of  Maryland,  Baltimore  County, 
September  2006.  http://ebiquity.umbc.edu/paper/html/id/339/ 

■ Agarwal, S., Context-Aware System to Create Electronic Medical Encounter
Records, M.S. thesis, Computer Science and Electrical Engineering, University of
Maryland, Baltimore County, May 2006.
http://ebiquity.umbc.edu/paper/html/id/338/

■  Agarwal,  S.,  Joshi,  A.,  Finin,  T.,  Ganous,  T.  and  Yesha,  Y.  Context-Aware 
System  to  Create  Electronic  Medical  Records,  technical  report  TR-CS-06-05, 
Computer  Science  and  Electrical  Engineering,  University  of  Maryland,  Baltimore 
County,  July  2006.  http://ebiquity.umbc.edu/paper/html/id/312/ 

■  Akshay  Java,  Tim  Finin  and  Sergei  Nirenburg,  Text  understanding  agents  and  the 
Semantic  Web,  39th  Hawaii  Int.  Conf.  on  System  Sciences,  Kauai  HI,  4-6 
January,  2006. 

■  Akshay  Java,  Tim  Finin,  and  Sergei  Nirenburg,  SemNews:  A  Semantic  News 
Framework,  Proceedings  of  the  21st  National  Conf.  on  Artificial  Intelligence,  July 
2006. 

■  Anand  Patwardhan,  Filip  Perich,  Anupam  Joshi,  Tim  Finin  and  Yelena  Yesha, 
Querying  in  packs:  Trustworthy  Data  Management  in  Ad  Hoc  Networks, 
International  Journal  of  Wireless  Information  Networks,  2006. 

■  Boanerges  Aleman-Meza,  Meenakshi  Nagarajan,  Cartic  Ramakrishnan,  Amit 
Sheth,  Budak  Arpinar,  Li  Ding,  Pranam  Kolari,  Anupam  Joshi,  and  Tim  Finin, 
"Semantic  Analytics  on  Social  Networks:  Experiences  in  Addressing  the  Problem 
of  Conflict  of  Interest  Detection",  WWW2006,  Edinburgh,  May  2006. 

■ Cynthia Parr, Andriy Parafiynyk, Joel Sachs, Li Ding, Sandor Dornbusch, Tim
Finin, Taowei Wang, and Allan Hollander, Integrating Ecoinformatics Resources
on the Semantic Web, poster paper, 15th International World Wide Web Conf.,
Edinburgh, May 2006.

■ Dandekar O, Walimbe V, Siddiqui K, Shekhar R: Image registration accuracy
with low-dose CT: How low can we go? In Proceedings of 2006 IEEE
International Symposium on Biomedical Imaging, p 502-505.

■  Dandekar  O,  Walimbe  V,  Shekhar  R:  Hardware  implementation  of  hierarchical 
volume  subdivision-based  elastic  registration.  In  Proceedings  of  the  28th  Annual 
International  Conference  of  the  IEEE  Engineering  in  Medicine  and  Biology 
Society  (IEEE  EMBC  2006),  p  1425-1428. 

■ Dandekar O, Castro-Pareja CR, Shekhar R: FPGA-based Reconfigurable 3D
Image Pre-processing for Image Guided Interventions. Journal of Real-Time
Image Processing. (Submitted)

■ Goodman L, Gulsun M, Washington L, Nagy P, Piacsek K. Inherent Variability of
CT Lung Nodule Measurements In Vivo Using Semiautomated Volumetric
Measurements. Am J Roentgenol, 186: 989-994, 2006.

■  Hu  P,  Xiao  Y,  Ho  D,  Mackenzie  CF,  Hu  H,  Voigt  R,  Martz  D.  Advanced 
Visualization  Platform  for  Surgical  Operating  Room  Coordination:  Distributed 
Video  Board  System.  Surgical  Innovation.  13(2):  129-135.  2006 

■  Jarrell  B,  Nirenburg  S,  McShane  M,  Fantry  G,  Beale  S,  Mallott,  D,  Raczek  J: 
“Simulation  for  Teaching  Decision  Making  in  Medicine:  The  Next  Step”,  In 
Proceedings  of  Medicine  Meets  Virtual  Reality  :  IGT-Registration  &  Navigation, 
2007. 

■  Jim  Parker,  Anand  Patwardhan,  Filip  Perich,  Anupam  Joshi  and  Tim  Finin,  Trust 
in  Pervasive  Computing,  in  The  Handbook  of  Mobile  Middleware,  P.  Bellavista 
and  A.  Corradi  (eds.),  pp.  473-496,  CRC  Press,  May  2006. 


■  Kavic  SM,  Segan  RD,  George  IM,  Turner  PL,  Roth  JS  and  Park  A.  Classification 
of  hiatal  hernias  using  dynamic  three-dimensional  reconstruction.  Surg 
Innovation.  2006;  13:  21-25. 

■  Kavic  SM,  Segan  RD,  George  IM,  Turner  PL,  Roth  JS,  Park  AE.  Classification 
of  hiatal  hernias  using  dynamic  three-dimensional  reconstruction,  Surg  Innov 
13(1):  49-52,  2006 

■  Lalana  Kagal  and  Tim  Finin,  Modeling  Conversation  Policies  using  Permissions 
and  Obligations,  Journal  of  Autonomous  Agents  and  Multi-Agent  Systems,  to 
appear,  2006. 

■ Nagy P, George I, Bernstein W, Caban J, Klein R, Mezrich R, and Park A. Radio
frequency identification systems technology in the surgical setting. Surg Innov
13(1): 61-67, 2006.

■  Nirenburg,  S.,  M.  McShane,  S.  Beale,  T.  O’Hara,  G.  Fantry,  J.  Raczek,  B.  Jarrell, 
“Cognitive  Simulation  in  Virtual  Patients”,  accepted  for  presentation  at  the  19th 
International  FLAIRS  Conference,  Melbourne  Beach,  Florida,  May  11-13,  2006. 

■  Pranam  Kolari  and  Tim  Finin,  A  Framework  for  Multi-Relational  Analytics  on 
the  Blogosphere,  (abstract),  Proceedings  of  the  21st  National  Conf.  on  Artificial 
Intelligence,  July  2006. 

■  Pranam  Kolari,  Akshay  Java  and  Tim  Finin,  "Characterizing  the  Splogosphere", 
Third  Annual  Workshop  on  Weblogging  Ecosystem:  Aggregation,  Analysis  and 
Dynamics,  held  in  conjunction  with  WWW2006,  Edinburgh,  May  2006. 

■  Pranam  Kolari,  Akshay  Java,  Tim  Finin,  Tim  Oates,  and  Anupam  Joshi,  Detecting 
Spam  Blogs:  A  Machine  Learning  Approach,  Proceedings  of  the  21st  National 
Conf.  on  Artificial  Intelligence,  July  2006. 

■  Pranam  Kolari,  Tim  Finin  and  Anupam  Joshi,  SVMs  for  the  Blogosphere:  Blog 
Identification  and  Spam  Detection,  AAAI  Spring  Symposium  on  Computational 
Approaches  to  Analyzing  Weblogs,  Stanford,  March  2006. 

■  Pranam  Kolari,  Tim  Finin,  Yelena  Yesha,  Kelly  Lyons,  Jen  Hawkins  and  Stephen 
Perelgut,  Policy  Management  of  Enterprise  Systems:  A  Requirements  Study, 
short  paper,  2006  IEEE  Workshop  on  Policy  for  Distributed  Systems  and 
Networks,  5-7  June  2006,  London,  Ontario  CN. 

■  Ross,  J.M.,  T.M.  Bayles,  C.  Parker,  S.L.Titus,  B.  Jarrell  and  J.  Raczek, 
Engineering  in  Health  Care  Multimedia  Curriculum  for  High  School  Technology 
Education,  accepted  for  presentation  in  the  K-12  Engineering  Outreach  Division 
of  the  2006  American  Society  for  Engineering  Education  Annual  Conference  & 
Exposition  in  Chicago,  IL,  June  2006 

■  Shekhar  R,  Dandekar  O,  Kavic  S,  George  I,  Mezrich  R,  Park  A,  "Development  of 
continuous  CT-guided  minimally  invasive  surgery,"  In  Proceedings  of  SPIE 
Medical  Imaging  2007:  Visualization  and  Image-Guided  Procedures,  vol.  6509, 
pp.  65090D,  2007 




■ Shekhar R, Dandekar O, Kavic S, George I, Mezrich R, Park A, "Development of
continuous CT-guided minimally invasive surgery," In Proceedings of
Medicine Meets Virtual Reality: IGT-Registration & Navigation, 2007

■  Tim  Finin  and  Li  Ding,  Search  Engines  for  Semantic  Web  Knowledge, 
Proceedings  of  XTech  2006:  Building  Web  2.0,  16-19  May  2006. 

■  Vartak,  N.,  Protecting  the  privacy  of  RFID  tags.  (M.S.  thesis)  Computer  Science 
and  Electrical  Engineering,  University  of  Maryland,  Baltimore  County,  May 
2006. 

■ Walimbe V, Shekhar R: Automatic elastic image registration by interpolation of
3D rotations and translations from discrete rigid-body transformations, Medical
Image Analysis

■ Wasei M, Xiao Y, Wieringa P, Strader M, Hu P, Mackenzie C. Visualization of
Uncertainty to Support Collaborative Trajectory Management in Hospital Care.
2006 Conference of International Ergonomics Association. 3721-3726. 2006

■ Xiao Y, Strader M, Hu P, Wasei M, Wieringa P. Visualization Techniques for
Collaborative Trajectory Management. ACM Conference on Human Factors in
Computing Systems, pp. 1547-1552. 2006

■ Xiao Y, Wasei M, Hu P, Wieringa P, Dexter F. Dynamic Management in
Perioperative Processes: A Modeling and Visualization Paradigm. 12th IFAC
Symposium on Information Control Problems in Manufacturing. (3)647-52. 2006

Year  4  (2007) 

■  Dexter  F,  Epstein  RH,  Traub  RD,  &  Xiao  Y.  Making  Management  Decisions  on 
the  Day  of  Surgery  Based  on  Operating  Room  Efficiency  and  Patient  Waiting 
Times.  Anesthesiology,  101(6):  1444-1453.  2004 

■ Dexter F, Xiao Y, Dow AJ, Strader MM, Ho D, Wachtel RE. Coordination of
Appointments for Anesthesia Care Outside of Operating Rooms Using an
Enterprise Wide Scheduling System. Anesthesia and Analgesia. 105:1701-1710.
2007


■  Dutton  R,  Ho  D,  Hu  P,  Mackenzie  CF,  Xiao  Y.  Decision  Making  by  Operating 
Room  Managers:  The  Burden  of  Changes.  Anesthesiology,  103:A1175.  2005 

■ Dutton R, Hu P, Seagull FJ, Scalea T, Xiao Y. Video for Operating Room
Coordination: Will the Staff Accept It? Anesthesiology, 101:A1389. 2004

■  Dutton  R,  Hu  PF,  Mackenzie  CF,  Seebode  S,  Xiao  Y.  A  Continuous  Video 
Buffering  System  for  Recording  Unscheduled  Medical  Procedures. 
Anesthesiology,  103:A1241.  2005 

■  Gilbert  TB,  Hu  PF,  Martz  DG,  Jacobs  J,  Xiao  Y.  Utilization  of  Status  Monitoring 
Video  for  OR  Management.  Anesthesiology,  103:A1263.  2005 

■ Hu P, Seagull FJ, Mackenzie CF, Seebode S, Brooks T, Xiao Y. Techniques for
Ensuring Privacy in Real-Time and Retrospective Use of Video. Telemedicine
and e-Health, 12(2): 204, T1E1. 2006




■  Hu  P,  Xiao  Y,  Ho  D,  Mackenzie  CF,  Hu  H,  Voigt  R,  Martz  D.  Advanced 
Visualization  Platform  for  Surgical  Operating  Room  Coordination:  Distributed 
Video  Board  System.  Surgical  Innovation.  13(2):  129-135.  2006 

■ Hu PF, Burlbaugh M, Xiao Y, Mackenzie CF, Voigt R, Brooks T, Fraser L,
Connolly MR, Herring T. Video Infrastructure and Application Design Methods
for an OR of the Future. Telemedicine and e-Health. 11(2), 211, T3C2. 2005

■ Hu PF, Hu H, Seagull JF, Mackenzie CF, Voigt R, Martz D, Dutton R, Xiao Y.
Distributed Video Board: Advanced Telecommunication System for Operation
Room Coordination. Telemedicine and e-Health. 11(2), 248, P28. 2005

■  Hu  PF,  Xiao  Y,  Mackenzie  CF,  Seagull  FJ,  Brooks  T,  LaMonte  MP,  &  Gagliano 
D.  Many  to  One  to  Many  Telemedicine  Architecture  and  Applications. 
Telemedicine  Journal  and  e-Health.  10(Supplement  1),  S-39.  2004 

■  Kim  Y-J,  Xiao  Y,  Hu  P,  Dutton  RP.  Staff  Acceptance  of  Video  Monitoring  for 
Coordination:  A  Video  System  to  Support  Perioperative  Situation  Awareness. 
Journal  of  Clinical  Nursing  (accepted).  2007 


■ O. Dandekar and R. Shekhar, "FPGA-accelerated Deformable Registration for
Improved  Target-delineation  During  CT-guided  Interventions,"  IEEE 
Transactions  on  Biomedical  Circuits  and  Systems,  vol.  1(2),  pp.  116-127,  2007. 

■ O. Dandekar, C. Castro-Pareja, and R. Shekhar, "FPGA-based real-time 3D image
preprocessing  for  image-guided  medical  interventions,"  Journal  of  Real-Time 
Image  Processing,  vol.  1(4),  pp.  285-301,  2007. 

■ O. Dandekar, K. Siddiqui, V. Walimbe, and R. Shekhar, "Image registration
accuracy  with  low-dose  CT:  how  low  can  we  go?,"  in  3rd  IEEE  International 
Symposium  on  Biomedical  Imaging:  Nano  to  Macro,  2006,  pp.  502-505. 

■ O. Dandekar, V. Walimbe, and R. Shekhar, "Hardware Implementation of
Hierarchical  Volume  Subdivision-based  Elastic  Registration"  in  28th  Annual 
International  Conference  of  the  IEEE:  Engineering  in  Medicine  and  Biology 
Society,  2006,  pp.  1425-1428. 

■ O. Dandekar, W. Plishker, S. Bhattacharyya, and R. Shekhar, "Multiobjective
Optimization of FPGA-Based Medical Image Registration," IEEE Symposium on
Field-Programmable Custom Computing Machines, Under Review, 2008.

■ R. Shekhar, O. Dandekar, S. Kavic, I. George, R. Mezrich, and A. Park,
"Development of continuous CT-guided minimally invasive surgery," Medicine
Meets Virtual Reality (MMVR), 2007.

■  R.  Shekhar,  O.  Dandekar,  S.  Kavic,  I.  George,  R.  Mezrich,  and  A.  Park, 
"Development  of  continuous  CT-guided  minimally  invasive  surgery,"  Proc  SPIE, 
Medical  Imaging  2007. 

■ Seagull FJ, Xiao Y, & Plasters C. Information Accuracy and Sampling Effort: A
Field Study of Surgical Scheduling Coordination. IEEE Transactions on Systems,
Man, and Cybernetics, Part A: Systems and Humans. 34(6), 764-771. 2004

■ Xiao Y, Dexter F, Hu FP, Dutton R. Usage of Distributed Displays of Operating
Room Video when Real-Time Occupancy Status was Available. Anesthesia and
Analgesia. 106(2):554-560. 2008




■ Xiao Y, Hu P, Hu H, Ho D, Dexter F, Mackenzie CF, Seagull FJ, Dutton D. An
algorithm for processing vital sign monitoring data to remotely identify operating
room occupancy in real-time. Anesthesia & Analgesia, 101(3):823-829. 2005

■  Xiao  Y,  Hu  P,  Moss  J,  de  Winter  J,  Venekamp  D,  Mackenzie  CF,  Seagull  FJ, 
Perkins  S.  Opportunities  and  challenges  in  improving  surgical  work  flow. 
Cognition,  Technology  &  Work,  accepted.  2007 

■  Xiao  Y,  Schimpff  S,  Mackenzie  CF,  Merrell  R,  Entin  E,  Voigt  R,  Jarrell  B.  Video 
Technology  to  Advance  Safety  in  the  Operating  Room  and  Perioperative 
Environment.  Surgical  Innovation.  14(1):  52-61.  2007 


■ Xiao Y, Strader M, Hu P, Wasei M, Wieringa P. Visualization Techniques for
Collaborative Trajectory Management. ACM Conference on Human Factors in
Computing Systems, pp. 1547-1552. 2006

■ Xiao Y, Wasei M, Hu P, Wieringa P, Dexter F. Dynamic Management in
Perioperative Processes: A Modeling and Visualization Paradigm. 12th IFAC
Symposium on Information Control Problems in Manufacturing. (3)647-52. 2006




Appendices 


A.  An  image  registration-based  approach  for  continuous  volumetric  CT-guided 
interventions.  Accepted  for  presentation  at  the  2009  Computer  Assisted  Radiology  and 
Surgery  (CARS)  Conference.  R.  Shekhar,  A.  Prithviraj.  [Abstract] 

B. High-speed reconstruction of low-dose CT using GP-GPU. Accepted for presentation
at  the  2008  IEEE  Nuclear  Science  Symposium  and  Medical  Imaging  Conference.  V. 
Bhat,  R.  Shekhar  [Extended  Abstract] 

C.  Multiobjective  optimization  for  reconfigurable  implementation  of  medical  image 
registration.  Accepted  by  International  Journal  of  Reconfigurable  Computing.  Dandekar 
O,  Plishker  W,  Bhattacharyya  SS,  Shekhar  R  [Preprint] 

D.  Augmented  Reality  for  Laparoscopic  Surgery  Using  a  Novel  Imaging  Method  -  Initial 
Results  from  a  Porcine  Animal  Model.  R  Shekhar,  C  Godinez,  S  Kavic,  E  Sutton,  V 
Bhat,  O  Dandekar,  I  George,  A  Park.  To  be  presented  at  2009  Society  of  American 
Gastrointestinal  and  Endoscopic  Surgeons  Conference.  [Abstract] 

E. Novel, Web-Based, Information-Exploration Approach for Improving Operating
Room  Logistics  and  System  Processes 

F.  A  Research  Portfolio  for  Innovation  in  the  Surgical  Environment 

G.  Methodological  Infrastructure  in  Surgical  Ergonomics:  A  Review  of  Tasks,  Models, 
and  Measurement  Systems 

H.  Ergonomic  risk  associated  with  assisting  in  minimally  invasive  surgery 

I.  Endoscopic  Video  Texture  Mapping  on  Pre-Built  3D  Anatomical  Objects  with  Camera 
Tracking 

J.  Model  Based  Estimation  and  Visualization  of  Global  Object  Deformations  for 
laparoscopic  Renal  Cryoablation 

K.  The  Maryland  Virtual  Patient 

L.  Surgical  Ergonomics  in  Minimally  Invasive  Surgery 




An  image  registration-based  approach  for  continuous  volumetric  CT-guided 
interventions 


Raj  Shekhar,  Ananthranga  Prithviraj 

Department  of  Diagnostic  Radiology,  University  of  Maryland  School  of  Medicine 


Purpose 

Minimally  invasive  image-guided  interventions  (IGIs)  that  include  biopsies,  ablations,  and  surgeries  are 
less  than  optimal  because  of  the  unavailability  of  continuous  three-dimensional  (3D)  visualization  of  the 
anatomy.  When  continuous  or  real-time,  intraoperative  imaging  remains  two-dimensional  as  in 
conventional  and  computed  tomography  (CT)  fluoroscopy,  ultrasound,  magnetic  resonance  (MR) 
imaging,  and  endoscopy.  When  three-dimensional  (3D),  as  in  volumetric  CT  and  MR  imaging,  the 
imaging  remains  temporally  discrete  and  the  resulting  image  guidance  is  stop-and-go  and  inefficient. 

Recent advances in multidetector CT (MDCT) are beginning to permit continuous 3D imaging during an
IGI. With the latest MDCT scanners, it is now possible to scan up to 10-12-cm thick regions of the
anatomy at an extremely high spatial resolution multiple times per second. Although continuous 3D CT
is technically feasible, radiation exposure concerns and the inability to visualize the vasculature
(contrast agents cannot be administered continuously) limit its clinical implementation.

We  present  here  a  novel  concept  based  on  high-speed  3D  image  registration  that  addresses  both  these 
problems.  We  acquire  a  single  contrast-enhanced  volumetric  CT  scan  (called  initial  CT)  at  a  diagnostic 
dose  at  the  start  of  the  IGI.  Subsequently,  CT  is  operated  at  a  low  dose  without  contrast.  A  diagnostic- 
quality  contrast-enhanced  image  of  the  operative  field  is  obtained  by  rapidly  and  nonrigidly  registering 
the  initial  CT  with  intraoperative  low-dose  CT.  We  present  here  the  feasibility  of  our  concept  in  terms  of 
registration  time  and  accuracy,  savings  in  radiation  dose,  and  intraoperative  vessel  visualization. 

Methods 

Our  imaging  specimen  was  a  swine  prepared  for  a  mock  laparoscopic  liver  surgery  under  experimental 
CT  guidance.  All  CT  images  were  acquired  using  a  64-slice  CT  scanner  (Philips  Brilliance-64)  following 
pneumoperitoneum. Before imaging, 4 markers (2-4 mm guidewire pieces) were implanted in the liver
parenchyma and two 2.3-mm calcium markers were sutured onto the liver surface for objective validation of
image  registration.  The  initial  CT  was  a  helical  CT  scan  of  the  liver  (53-cm  axial  coverage)  at  normal 
breathing  with  arterial  phase  enhancement  at  a  diagnostic  dose  (250  mAs  tube  current).  The  swine  was 
then  overventilated  to  accentuate  liver  motion  and  deformation  from  the  time  of  initial  CT.  CT  scanning, 
simulating  intraoperative  imaging,  was  then  performed  at  high,  medium,  and  low  doses  (200,  75,  and  25 
mAs,  respectively)  to  determine  the  lower  limit  of  low-dose  CT.  For  all  3  doses,  this  CT  was  performed  in 
2  modes:  helical  and  axial.  The  helical  mode  allowed  complete  coverage  of  the  liver  but  provided  only  a 
snapshot  of  it.  The  axial  mode  could  acquire  repeated  scans  (we  acquired  100)  at  0.9  Hz,  but  the  4-cm 
axial  coverage  of  the  CT  scanner  used  permitted  only  partial  liver  coverage.  Using  a  previously  reported, 
hardware-accelerated  implementation  of  nonrigid  image  registration,  we  registered  the  initial  CT  with 
each  of  the  3  helical  CT  scans  and  each  of  the  100  scans  in  the  3  axial  CT  scan  sequences.  The  initial 
relative  position  of  image  pairs  was  based  on  slice  location  data  saved  with  images.  Using  the  implanted 
markers,  the  initial  and  postregistration  misalignments  (reflective  of  registration  accuracy)  were 
computed.  The  time  of  registration  was  also  recorded. 
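
The marker-based validation described above can be illustrated with a minimal Python sketch. It assumes that misalignment is taken as the mean Euclidean distance between corresponding marker positions in the two images; the abstract does not state the exact formula, and the function name and the coordinates below are hypothetical, not data from the study.

```python
import math

def mean_marker_misalignment(markers_a, markers_b):
    """Mean Euclidean distance (mm) between corresponding 3D marker positions."""
    if len(markers_a) != len(markers_b) or not markers_a:
        raise ValueError("need two equal-length, non-empty marker lists")
    distances = [math.dist(p, q) for p, q in zip(markers_a, markers_b)]
    return sum(distances) / len(distances)

# Hypothetical marker coordinates in mm (illustration only).
initial_ct = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
intraop_ct = [(3.0, 4.0, 0.0), (10.0, 0.0, 1.0)]
print(mean_marker_misalignment(initial_ct, intraop_ct))
```

Evaluating the same function before and after registration would give the two misalignment figures reported per scan.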
Results


For  3  helical  scans,  approximately  3-mm  initial  misalignment  reduced  to  approximately  1.5  mm  for  all  3 
doses  after  nonrigid  registration  of  initial  CT  with  intraoperative  CT  (Table  1).  The  results  show  that  the 
dose  had  virtually  no  effect  on  registration  accuracy,  indicating  that  intraoperative  CT  could  be 
performed  at  25  mAs.  Most  structural  mismatches  before  registration  were  removed  after  registration 
(Figure  1).  In  Figure  2,  volume  rendering  of  the  original  intraoperative  CT  and  registered  initial  CT 
(representing  intraoperative  anatomy)  is  shown.  Note  that  using  our  concept,  the  vasculature  can  be 
visualized  throughout  an  IGI  without  having  to  administer  CT  contrast.  A  similar  registration  was 
performed  between  initial  CT  and  axial  scan  sequences  at  the  3  doses.  The  mean  initial  and 
postregistration  misalignments  (averaged  over  100  scans)  are  shown  in  Table  2.  The  nonrigid 
registration  exhibited  acceptable  accuracy  despite  small  coverage.  These  results,  too,  suggest  that 
intraoperative  CT  at  25  mAs  is  acceptable.  The  mean  registration  time  for  larger  helical  data  sets  was 
430  s  and  for  smaller  axial  scans  was  63  s. 

Table 1. Image misalignment before and after registration for helical scans.

Dose (mAs)     Initial Misalignment (mm)    Misalignment after Registration (mm)
200 (high)     3.12                         1.47
75 (medium)    3.63                         1.67
25 (low)       3.25                         1.45

Table 2. Average image misalignment before and after registration for axial scans.

Dose (mAs)     Mean Initial Misalignment (mm)    Mean Misalignment after Registration (mm)
200            4.40                              1.05
75             4.51                              1.12
25             4.69                              1.21
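
As a reader's check on Tables 1 and 2, the short Python sketch below computes the relative reduction in misalignment at each dose. The values are copied from the two tables; the dictionary layout and function name are illustrative assumptions. On these numbers the reduction is roughly one half for the helical scans and about three quarters for the axial scans, with little variation across dose.

```python
# (initial misalignment mm, post-registration misalignment mm), keyed by dose in mAs.
helical = {200: (3.12, 1.47), 75: (3.63, 1.67), 25: (3.25, 1.45)}
axial = {200: (4.40, 1.05), 75: (4.51, 1.12), 25: (4.69, 1.21)}

def percent_reduction(before, after):
    """Relative reduction in misalignment, in percent."""
    return 100.0 * (before - after) / before

for name, table in (("helical", helical), ("axial", axial)):
    for dose, (before, after) in table.items():
        print(f"{name} {dose} mAs: {percent_reduction(before, after):.1f}% reduction")
```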

Figure  1.  Superposition  of  initial  CT  and  intraoperative  CT  (acquired  at  200  mAs)  before  (left)  and  after 
(right)  nonrigid  registration.  Note  better  structure  alignment  after  registration. 
Figure 2. Volume rendering of original intraoperative CT and registered initial CT. Note that the latter
shows the vasculature.


Conclusions 

We  have  presented  proof-of-concept  results  using  high-speed  image  registration  to  substitute 
noncontrast  low-dose  intraoperative  CT  images  with  a  modified  contrast-enhanced  diagnostic-quality 
initial  CT  image  during  an  IGI.  The  latter  contains  contrast  enhanced  structures  for  which  visualization 
may  be  critical  during  an  IGI.  Our  strategy  also  led  to  a  10-fold  savings  in  radiation  dose  (tube  current 
reduced  from  250  mAs  to  25  mAs).  25  mAs  was  the  lowest  setting  on  the  CT  scanner  used,  suggesting 
that  further  dose  savings  are  possible  if  the  CT  scanner  could  be  operated  at  a  lower  dose.  Finally,  the 
residual  misregistration  on  the  order  of  1  mm  is  acceptable,  because  the  targeting  uncertainty  in  most 
IGIs is currently much greater. It should also be noted that the concept extends to any preoperative
image, and the choice of image is likely to vary depending on the specific imaging needs of an IGI. Ours was an offline feasibility
study.  Its  clinical  implementation  will  require  further  speed  improvement  of  CT  reconstruction  and 
image  registration.  Overall,  we  have  presented  a  novel  concept  along  with  demonstration  of  its 
feasibility  that  promises  to  enable  use  of  the  latest  MDCT  for  continuous  3D  visualization  at  acceptably 
low  doses  during  most  IGIs. 
HIGH-SPEED RECONSTRUCTION OF LOW-DOSE CT USING GP-GPU

Venkatesh  Bhat,  Raj  Shekhar 


ABSTRACT 

Minimally invasive image-guided interventions (IGIs) lead to improved treatment outcomes while
significantly reducing patient trauma and recovery time. Ultrasound and fluoroscopy have
traditionally been used for image guidance, but these modalities do not provide a comprehensive 3D view of the
anatomy. Because of features such as fast scanning, high spatial resolution, 3D view, and ease of operation,
CT is increasingly being used to navigate IGIs. The risk of radiation exposure, however, limits its current
and future use.

We perform ultralow-dose scanning to overcome this limitation. To address the image quality problem
with ultralow-dose CT, we reconstruct images using the iterative Paraboloidal Surrogate (PS) algorithm. As
iterative techniques are generally compute intensive, we have accelerated the PS algorithm on a GPU.
Here, we first compare the quality of the low-dose images reconstructed using the PS algorithm and the
standard filtered back projection (FBP) algorithm. Using actual scanner data, we achieved a visually
acceptable improvement in the quality of reconstructed images using the iterative algorithm.

We further demonstrate a fast implementation of the Ordered Subsets version of the PS algorithm for
axial scans on an NVIDIA 8800 GTX GPU using CUDA (Compute Unified Device Architecture). Several
recent studies have reported computing forward and back projection on the GPU using the
rasterization framework. However, the GP-GPU (General Purpose GPU) framework used in our
implementation, being more generic, accommodated a wide variety of penalty functions on the GPU and
thus obviated the need to transfer data between the GPU and CPU during reconstruction.

We have compared the GPU implementation using the ray-tracing method to a similar implementation
on the CPU and to the traditional CPU implementation using a weight matrix. We demonstrate two orders of
magnitude improvement in speed while the image quality remains comparable to the traditional CPU implementation.
High-Speed Reconstruction of Low-Dose CT Using GP-GPU


I.  OVERVIEW 

As  discussed  in  the  abstract,  optimal  use  of  CT  for  IGIs 
mandates  the  acquisition  of  data  at  low  radiation  doses. 
Although  it  is  widely  acknowledged  that  iterative  algorithms 
are  better  suited  for  reconstruction  of  noisy  data,  it  has  not 
been  proven  that  these  algorithms  lead  to  better  quality 
images  from  low-dose  CT  scans  as  compared  to  the  more 
common  FBP  algorithm.  In  this  study  we  acquire  low 
radiation  dose  CT  scans  and  compare  the  images 
reconstructed  using  the  FBP  and  PS  algorithms.  These 
results  are  shown  in  section  II. 

It  is  a  well  known  fact  that  iterative  algorithms  are 
computationally  intensive  as  compared  to  the  FBP 
algorithm.  In  Section  III,  we  demonstrate  a  ray-tracing 
based  method  for  fast  implementation  of  the  PS  algorithm 
for  axial  scans.  We  implement  this  method  on  the  CPU  as 
well  as  the  GPU  and  compare  the  speedups  and  the  image 
reconstruction  quality  in  section  IV. 

II.  LOW-DOSE  CT  RECONSTRUCTION 

In  this  study,  we  acquired  axial  scans  using  a  64-slice 
Philips  Brilliance  CT  scanner  at  the  minimum  permissible 
tube  current  of  25  mA.  We  acquired  the  scanner 
preprocessed  data  after  angular  rebinning  to  obtain  parallel 
beam  sinograms.  We  then  used  the  FBP  and  the  PS 
algorithms  [1]  to  reconstruct  the  data.  Every  slice  was 
reconstructed  to  1024  x  1024  voxels  from  400  views  with 
1498  detectors  each.  The  results  are  as  shown  in  Fig.  1. 

It  is  clear  from  the  images  that  the  PS  algorithm  provides 
higher  quality  reconstructed  images  in  comparison  to  the 
FBP  algorithm  for  the  low  radiation  dose  scans. 


Fig 1. a) Image reconstructed using FBP algorithm (top),
b)  Image  after  10  iterations  of  PS  algorithm. 


III.  FAST  IMPLEMENTATION  OF  PS  ALGORITHM 

In  the  past,  results  have  been  presented  on  the  use  of 
graphics  processors  for  CT  image  reconstruction.  In  [2]  a 
general  framework  for  the  use  of  GPU  in  reconstruction 
algorithms  was  presented.  This  was  mainly  based  on  the  use 
of  the  graphics  pipeline  for  acceleration  of  forward  and  back 
projection  steps.  In  [3]  the  GPU  was  used  to  accelerate  these 
steps  of  the  convex  algorithm  using  the  framework 
suggested  in  [2].  Other  algorithms  such  as  SART  and 
OSEM  have  also  been  accelerated  using  similar 
frameworks.  All  of  these  implementations  relied  on 
languages  such  as  OpenGL  and  other  shading  languages  that 
prevented  direct  programming  of  the  GPU  for  various  kinds 
of  mathematical  operations. 

In  [4],  the  FDK  algorithm  was  accelerated  for  3D  cone 
beam geometry using CUDA. We accelerate the PS
algorithm  for  parallel  beam  data  from  axial  scans  using 
CUDA.  The  generic  PS  algorithm  [1]  mainly  consists  of  3 
steps:  1)  forward  projection  2)  back  projection  3)  pixel 
update.  For  the  GPU  based  implementation,  we  reconstruct 
4  slices  at  a  time  to  make  optimal  use  of  the  4  channels 
(RGBA) in the texture memory. We also make use of the
hardware  based  bilinear  interpolators  in  the  texture  memory 
for  the  interpolation  purposes.  For  the  CPU,  a  software 
implementation  of  the  bilinear  interpolator  is  used. 


Fig  2.  The  forward  projection  mechanism  based  on  ray 
tracing. 

The  CUDA  forward  projection  kernel  consists  of  a  2D 
grid of blocks of size 16 x 16, where each thread represents
a  ray.  The  image  from  the  previous  iteration  is  loaded  into 
the  texture  memory.  A  single  warp  per  block  is  used  to 
calculate  the  sine  and  cosine  values  for  the  rotation  matrix 
which  is  stored  in  the  local  shared  memory.  The  incident 
rays  are  rotated  according  to  the  sine  and  cosine  values  and 
every  thread  sums  up  the  bilinearly  interpolated  pixel  values 
along  the  uniformly  sampled  points  on  the  rays  to  obtain  the 
new  sinogram.  Fig  2  demonstrates  the  forward  projection 
idea. 
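
The ray-tracing forward projection described above can be sketched sequentially in Python. This is only an illustration under stated assumptions, not the CUDA kernel itself: the step size, ray length, coordinate conventions, and function names are hypothetical, and the software `bilinear` function stands in for the hardware texture interpolator. On the GPU, each iteration of the inner detector loop would correspond to one thread.

```python
import math

def bilinear(image, x, y):
    """Bilinearly interpolate image (list of rows) at (x, y); zero outside."""
    h, w = len(image), len(image[0])
    if x < 0 or y < 0 or x > w - 1 or y > h - 1:
        return 0.0
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def forward_project(image, angles, n_detectors, step=1.0):
    """Return sinogram[view][detector]: sums of samples along parallel rays."""
    h, w = len(image), len(image[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    half_len = math.hypot(w, h) / 2.0          # half of the image diagonal
    n_steps = int(2 * half_len / step) + 1     # uniformly spaced samples per ray
    sinogram = []
    for theta in angles:
        c, s = math.cos(theta), math.sin(theta)
        row = []
        for d in range(n_detectors):           # one ray per detector element
            t = d - (n_detectors - 1) / 2.0    # signed offset from the central ray
            total = 0.0
            for k in range(n_steps):
                u = -half_len + k * step       # position along the ray
                x = cx + t * c - u * s         # rotate (t, u) into image coordinates
                y = cy + t * s + u * c
                total += bilinear(image, x, y)
            row.append(total * step)
        sinogram.append(row)
    return sinogram

# Hypothetical 3 x 3 phantom with a single bright pixel at the center.
phantom = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
print(forward_project(phantom, [0.0, math.pi / 4], 3))
```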

For  the  back  projection,  the  accumulated  data  for  every  view 
(i.e.,  each  row  of  the  sinogram)  is  duplicated,  leading  to  the 
creation  of  an  ‘extended’  sinogram.  This  ‘extended 
sinogram’  is  loaded  into  the  texture  memory.  The  kernel 
consists  of  16  x  16  blocks  where  each  thread  represents  a 
pixel on the resulting image. The image is
rotated  and  the  value  of  each  pixel  location 
is  shifted  vertically  such  that  it  lies  between 
the  2  rows  of  the  ‘extended  sinogram’  for  the 
corresponding  view.  Bilinear  interpolation  is 
then  used  to  obtain  the  pixel  value. 
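
A pixel-driven back projection of this kind can be sketched in Python as follows. One plausible reading of the 'extended sinogram' is that duplicating each row lets the 2D hardware bilinear interpolator act as a 1D linear interpolator along the detector axis; that interpretation is an assumption, and the sketch below simply uses 1D linear interpolation in software. The function name and geometry conventions are hypothetical.

```python
import math

def back_project(sinogram, angles, size):
    """Pixel-driven parallel-beam back projection onto a size x size image."""
    n_det = len(sinogram[0])
    c_img = (size - 1) / 2.0
    c_det = (n_det - 1) / 2.0
    image = [[0.0] * size for _ in range(size)]
    for row, theta in zip(sinogram, angles):
        c, s = math.cos(theta), math.sin(theta)
        for j in range(size):                  # on the GPU: one thread per pixel
            for i in range(size):
                # Detector coordinate of the ray passing through pixel (i, j).
                t = (i - c_img) * c + (j - c_img) * s + c_det
                if 0.0 <= t <= n_det - 1:
                    t0 = int(t)
                    t1 = min(t0 + 1, n_det - 1)
                    f = t - t0
                    # Linear interpolation between neighboring detector samples.
                    image[j][i] += row[t0] * (1 - f) + row[t1] * f
    return image

# Hypothetical single-view sinogram with uniform values.
print(back_project([[1.0, 1.0, 1.0, 1.0]], [0.0], 4))
```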

Thus  each  iteration  consists  of  one 
call  to  the  ‘forward  projection  kernel’  and 
one  call  to  the  ‘back  projection  kernel’  apart 
from  one  call  to  the  ‘pixel  update  kernel’ 
that  calculates  the  penalty  function  and 
updates  the  image.  The  forward  and  back 
projection  kernels  are  preceded  by  a  single 
transfer  of  data  to  the  texture  memory  from 
the  main  GPU  memory. 

IV.  RESULTS 


We  created  synthetic  projections  from  a  clinical  phantom 
image.  The  projections  were  then  used  to  reconstruct  the 
phantom  image  using  three  methods: 

a)  An  unmodified  implementation  of  the  PS  algorithm  on 
the  CPU  using  pre-computed  weights. 

b) The ray-tracing version of the PS algorithm on the CPU.

c) The ray-tracing version of the PS algorithm on the GPU.

The reconstructed images were compared using PSNR as the
measure of quality, with the original phantom image as
the benchmark.
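
For reference, a minimal Python sketch of a PSNR computation is given here. The abstract does not say which peak value was used; taking the maximum of the reference image is an assumption, and the function name and test images are hypothetical.

```python
import math

def psnr(reference, reconstructed, peak=None):
    """PSNR in dB between two equally sized 2D images (lists of rows)."""
    ref = [v for row in reference for v in row]
    rec = [v for row in reconstructed for v in row]
    if len(ref) != len(rec) or not ref:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    if mse == 0:
        return math.inf                  # identical images
    if peak is None:
        peak = max(ref)                  # assumption: peak taken from the reference
    return 10.0 * math.log10(peak * peak / mse)

# Hypothetical 2 x 2 images (illustration only).
original = [[0.0, 1.0], [1.0, 0.0]]
estimate = [[0.0, 1.0], [1.0, 0.5]]
print(psnr(original, estimate))
```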


OSEM Subsets   CPU (sec), Pre-Computed   CPU (sec), Ray Tracing   GPU (sec), Ray Tracing
5              242                       6.66                     0.03596
10             244                       6.68                     0.05418

Table 1: Comparison of time per iteration on the CPU (using
pre-computed weights and ray-tracing algorithms) and the GPU
for a 256 x 256 image slice with 400 views. This includes all
communication overhead except the one-time latency for
transfer of the sinogram from the scanner. On the GPU, the
time taken for reconstruction of 1, 2, 3, or 4 slices is the same.

[Figure: PSNR curves, CPU (Pre-Computed Weights) vs. GPU (Ray Tracing) Image Quality]


V.  CONCLUSION 

From Table 1, it is clear that a speedup of about 30X is
obtained by the ray-tracing algorithm and related
algorithmic improvements. A further acceleration of
150-200X is obtained by the parallel implementation using
CUDA on the GPU. Fig. 4 compares the quality of the
reconstructed images on the GPU using the ray-tracing method
and on the CPU using the accurate pre-computed weight matrix.
The PSNR curves for the GPU closely follow those for
the  CPU  based  implementation.  It  is  clear  that  the  use  of  the 
discrete  ray  tracing  method  and  single  precision 
computations  on  the  GPU  do  not  significantly  impact  the 
quality  of  the  reconstructed  images.  The  convergence  rate 
improves  with  the  use  of  suitable  penalty  functions  as 
described  in  [1].  In  this  implementation  no  penalty  function 
was  used  in  order  to  compare  the  native  algorithms. 
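
The speedup figures can be checked against Table 1 with a few lines of Python. The timings are copied from the table; the dictionary keys and function name are illustrative assumptions. The printed ratios can be compared with the approximate figures quoted in the text.

```python
# Time per iteration in seconds, copied from Table 1, keyed by number of subsets.
timings = {
    5: {"cpu_precomputed": 242.0, "cpu_ray": 6.66, "gpu_ray": 0.03596},
    10: {"cpu_precomputed": 244.0, "cpu_ray": 6.68, "gpu_ray": 0.05418},
}

def speedups(t):
    """Ratios of per-iteration times between the three implementations."""
    return {
        "ray_vs_precomputed": t["cpu_precomputed"] / t["cpu_ray"],
        "gpu_vs_cpu_ray": t["cpu_ray"] / t["gpu_ray"],
        "gpu_vs_precomputed": t["cpu_precomputed"] / t["gpu_ray"],
    }

for subsets, t in timings.items():
    print(subsets, {k: round(v, 1) for k, v in speedups(t).items()})
```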

REFERENCES 

1) H. Erdogan, J. Fessler, “Monotonic Algorithms for
Transmission Tomography”, IEEE Trans. Med. Imaging,
Sept 1999, pp. 801-814.

2) F. Xu and K. Mueller, “A Unified Framework for Rapid 3D
Computed Tomography on Commodity GPUs”, IEEE
Medical Imaging Conference, 2003.

3) J.S. Kole et al., “Evaluation of accelerated iterative X-ray
CT image reconstruction using floating point graphics
hardware”, IEEE Nuclear Science Symp., 2004.

4) H. Scherl et al., “Fast GPU-Based CT Reconstruction
using the Common Unified Device Architecture (CUDA)”,
IEEE Nuclear Science Symp., 2007.
In the field of real-time signal processing, a single application often has multiple computationally intensive kernels that can
benefit  from  acceleration  from  custom  or  reconfigurable  hardware  platforms,  such  as  field-programmable  gate  arrays  (FPGAs).  For 
adaptive utilization of resources at run time, FPGAs with capabilities for dynamic reconfiguration are emerging. To exploit
run-time reconfiguration, especially in the presence of unknown or dynamically varying resource availability, it is useful for designers
to  derive  sets  of  efficient  design  configurations  that  trade  off  application  performance  with  fabric  resources.  Such  sets  can  then 
be  maintained  at  run  time  so  that  the  best  available  design  tradeoff  is  used  based  on  run-time  operating  conditions.  However, 
finding  a  single- optimized  configuration  is  known  to  be  a  difficult  problem,  and  generating  a  family  of  optimized  configurations 
suitable  for  different  run-time  scenarios  is  even  more  difficult  and  relatively  understudied.  Toward  this  end,  this  paper  presents  a 
novel  multiobjective  wordlength  optimization  strategy  developed  in  the  context  of  FPGA-based  implementation  of  a  representative 
computationally  intensive  image  processing  application,  medical  image  registration.  The  tradeoff  between  FPGA  resources  (area 
and  memory)  and  implementation  accuracy  is  explored,  and  Pareto -optimized  wordlength  configurations  are  systematically 
identified.  Within  this  framework,  we  also  compare  several  search  methods  for  finding  Pareto-optimized  design  configurations  and 
demonstrate  the  applicability  of  search  based  on  evolutionary  techniques  for  efficiently  identifying  superior  multiobjective  tradeoff 
curves  in  the  context  of  the  chosen  problem.  Furthermore,  through  postsynthesis  validation,  we  demonstrate  the  feasibility  of  such 
a  framework  in  the  context  of  FPGA-based  medical  image  registration.  With  additional  work,  this  optimization  strategy  may  also 
be  adapted  to  a  wide  range  of  signal  processing  applications,  including  applications  for  processing  various  kinds  of  image  and 
video  signals. 

Copyright  ©  2008  Omkar  Dandekar  et  al.  This  is  an  open  access  article  distributed  under  the  Creative  Commons  Attribution 
License,  which  permits  unrestricted  use,  distribution,  and  reproduction  in  any  medium,  provided  the  original  work  is  properly 
cited. 


1.  Introduction 

In the field of real-time signal processing systems, acceleration
of computationally intensive algorithmic components is
often achieved by mapping them to custom or reconfigurable
hardware platforms, such as field-programmable gate arrays
(FPGAs). Often, multiple kernels in a single application
can benefit from this approach to acceleration, requiring
them to share a single fabric. This is particularly necessary
in applications where multiple kernels share data and feed
results to each other. For example, in medical imaging
it has been shown that both image preprocessing [1-3]
and image registration [4-6] can achieve high levels of
speedup through hardware acceleration. To maximize the
performance of an application and to optimize the fabric
resource utilization, the kernels must be designed to meet
their application requirements while balancing their resource
consumption on the fabric. Application requirements often
change at run time, and strategies based on static design
must try to identify a reasonable “average case” design
configuration that accommodates all possible scenarios.
Because this approach can be highly suboptimal and can
result in significant under- or overutilization of the fabric in
many scenarios, modern FPGAs are emerging with run-time
reconfiguration capabilities. Self-monitoring FPGA implementations
are able to adapt to variable application requirements
and reconfigure their processing structures to better-suited
design configurations [7]. This not only improves
application performance but also results in more effective
utilization of fabric resources. To exploit this technology, it
is highly desirable that the designers provide quality design
configurations that trade off application performance with
fabric resources. Consequently, the primary focus of this
work is to develop a framework that enables the designers
to identify such optimized design configurations.

A common system parameter for trading off resource and
performance is datapath wordlength. Typically, algorithms
are first developed in software using floating-point representation
and later migrated to hardware using finite precision
(e.g., fixed-point representation) for achieving improved
computational speed and reduced hardware cost. These
implementations are often parameterized, so that a wide
range of finite precision representations can be supported
[8] by choosing an appropriate wordlength for each internal
variable. As a consequence, the accuracy and hardware
resource requirements of such a system are functions of the
wordlengths used to represent the internal variables. Determining
an optimal wordlength configuration that minimizes
the hardware implementation cost while satisfying a design
criterion such as maximum output error has been shown
to be nondeterministic polynomial-time (NP)-hard [9] and
can take up to 50% of the design time for complex systems
[10]. In addition, a single optimal solution may not exist,
especially in the presence of multiple conflicting objectives.
Moreover, a new configuration generally must be derived
when the design constraints are altered.

An optimum wordlength configuration can be identified
by analytically solving the quantization error equation
as described previously by several authors [11-15]. This
analytical representation, however, can be difficult to obtain
for complex systems. Techniques based on local search
or gradient-based search [16] have also been employed,
but these methods are limited to finding a single feasible
solution as opposed to an optimized tradeoff curve. An
exhaustive search of the entire design space is guaranteed
to find Pareto-optimal configurations. Execution time for
such exhaustive search, however, increases exponentially
with the number of design parameters, making it infeasible
for most practical systems. Methods that transform this
problem into a linear programming problem have also
been reported [11], but these techniques are limited to
cases in which the objectives can be modeled as linear
functions of the design parameters. Other approaches based
on linear aggregation of objectives may not find proper
Pareto-optimal solutions when the search space is nonconvex
[17]. Techniques based on evolutionary methods have been
shown to be effective in searching large search spaces in
an efficient manner [18, 19]. Furthermore, these techniques
are inherently capable of performing multipoint searches.
As a result, techniques based on evolutionary algorithms
(EAs) have been employed in the context of multiobjective
optimization (SPEA2 [20], NSGA-II [21]). However, their
application to solving wordlength optimization problems has
been limited.

We formulate this problem of finding optimal
wordlength configurations as a multiobjective optimization,
where different objectives (for example, accuracy and
area) generally conflict with one another. Although this
approach increases the complexity of the search, it can
find a set of Pareto-optimized configurations representing
strategically chosen tradeoffs among the various objectives.
This allows a designer to choose an efficient configuration
that satisfies given design constraints and provides ease
and flexibility in modifying the design configuration as
the constraints change. In this work, we present this novel
multiobjective optimization strategy and demonstrate its
feasibility in the context of FPGA-based implementation
of medical image registration. The tradeoff between
FPGA resources (area and memory) and implementation
accuracy is systematically explored, and Pareto-optimized
solutions are identified. This analysis is performed by
treating the wordlengths of the internal variables as design
variables. We also compare several search methods for
finding Pareto-optimized solutions and demonstrate, in the
context of the chosen problem, the applicability of search
based on evolutionary techniques for efficiently identifying
superior multiobjective tradeoff curves. In comparison
with the earlier reported techniques, our work captures
more comprehensively the complexity of the underlying
multiobjective optimization problem and demonstrates
the applicability of our framework in finding superior
Pareto-optimized solutions in an efficient manner, even in
the presence of a nonlinear objective function.

This paper is organized as follows. Section 2 provides
background on image registration and outlines an architecture
for its FPGA-based implementation. We also highlight
some strategies for parameterized design and synthesis of
this architecture. Formulations for multiobjective optimization
and various search methods to find Pareto-optimized
solutions are described in Section 3. Section 4 describes
experimental results, compares various search methods, and
presents postsynthesis validation of the presented strategy.
Section 5 discusses wordlength search and multiobjective
optimization. Section 6 concludes the paper.


2.  Image  Registration 


Medical  image  registration  is  the  process  of  aligning  two 
images  that  represent  the  same  anatomy  at  different  times, 
from  different  viewing  angles,  or  using  different  imaging 
modalities.  Image  registration  is  an  active  area  of  research 
and  over  the  last  several  decades  numerous  publications 
have  outlined  various  methodologies  to  perform  image 
registration  and  its  applications.  Maintz  and  Viergever  [22] 
and  Hill  et  al.  [23]  have  presented  a  comprehensive  summary 
of  the  range  of  the  image  registration  domain.  Several 
types  of  image  registration  are  in  routine  use  (see  [22-25]); 
however,  registration  based  on  voxel  intensities  remains  the 
most  versatile,  powerful,  and  inherently  automatic  way  of 
achieving  the  alignment  between  two  images.  This  approach, 
in  general,  attempts  to  find  the  transformation  (T)  that 
optimally  aligns  a  reference  image  (RI)  with  coordinates  x,  y, 
and z and a floating image (FI) under an image similarity
measure (F):

$$\hat{T} = \arg\max_{T} F(RI(x, y, z), FI(T(x, y, z))). \quad (1)$$

Many image similarity measures, such as the sum of squared
differences and cross-correlation, have been used, but over
the last decade mutual information (MI) has emerged as the
preferred similarity measure. MI is an information theoretic
measure and is calculated as

$$MI(RI, FI) = h(RI) + h(FI) - h(RI, FI). \quad (2)$$

In this equation, $h(RI)$ and $h(FI)$ are the individual entropies
and $h(RI, FI)$ is the mutual (joint) entropy of the images to be
registered. These entropies are further calculated as:

$$h(RI) = -\sum_{a} p_{RI}(a) \ln p_{RI}(a),$$

$$h(FI) = -\sum_{b} p_{FI}(b) \ln p_{FI}(b), \quad (3)$$

$$h(RI, FI) = -\sum_{a} \sum_{b} p_{RI,FI}(a, b) \ln p_{RI,FI}(a, b).$$

Here, the notations $p_{RI}$, $p_{FI}$, and $p_{RI,FI}$ represent the individual
probability distribution function (PDF) of RI, individual
PDF of FI, and the mutual PDF of RI and FI, respectively.
These  distributions  are  estimated  from  the  individual  and 
mutual  histograms  of  the  images  to  be  registered.  Additional 
details  about  computation  of  MI,  its  properties,  and  its 
application  to  image  registration  can  be  found  in  an  article 
by  Pluim  et  al.  [25]. 
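As an illustration of (2) and (3), the following is a minimal software sketch of computing MI from a mutual histogram. It is a floating-point Python model, not the FPGA datapath described in this paper; the function name `mutual_information` and the list-of-rows histogram layout are assumptions made for the example.

```python
import math

def mutual_information(joint_hist):
    """MI in nats from a mutual histogram given as a list of rows.

    Rows are indexed by RI intensity, columns by FI intensity
    (an assumed layout for this sketch).
    """
    total = sum(sum(row) for row in joint_hist)
    # Mutual PDF and the individual PDFs obtained as its marginals.
    p_joint = [[count / total for count in row] for row in joint_hist]
    p_ri = [sum(row) for row in p_joint]
    p_fi = [sum(col) for col in zip(*p_joint)]

    def entropy(probs):
        # Bins with p == 0 are skipped: p*ln(p) tends to 0 as p -> 0.
        return -sum(p * math.log(p) for p in probs if p > 0)

    h_ri = entropy(p_ri)
    h_fi = entropy(p_fi)
    h_joint = entropy([p for row in p_joint for p in row])
    return h_ri + h_fi - h_joint
```

For two perfectly aligned two-level images the mutual histogram is diagonal and MI equals the individual entropy; for statistically independent images MI is zero.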

MI-based image registration has been shown to be
robust and effective in multimodality image registration
[24]. However, this form of registration typically requires
thousands of iterations (MI evaluations), depending on
image complexity and the degree of initial misalignment
between images. Castro-Pareja et al. [4] have shown that
calculation of MI for different candidate transformations
is a factor limiting the performance of MI-based image
registration. We have, therefore, developed an FPGA-based
architecture for accelerated calculation of MI [6] that is
capable of computing MI 40 times faster than a software
implementation. The transformation model (T in (1))
employed by this architecture is a locally rigid-body model
consisting of three-dimensional translations and rotations.
Consequently,  the  analysis  presented  in  this  article  pertains 
to  locally  rigid  transformations.  However,  it  must  be  noted 
that  hierarchical  rigid-body  transformations  can  be  used 
to  represent  deformable  (nonrigid)  transformation  models 
as  demonstrated  in  the  volume  subdivision-based  approach 
reported  by  Walimbe  and  Shekhar  [26]. 

2.1.  FPGA-Based  Implementation  of  Mutual 
Information  Calculation 

During the execution of image registration using this
architecture, the optimization process is executed from a host
workstation. The host provides a candidate transformation,
while the FPGA-based implementation applies it to the
images and performs the corresponding MI computation.
The computed MI value is then further used by the host to
update the candidate transformation and eventually find the
optimal alignment between the RI and FI. Figure 1 shows the
top-level block diagram of the aforementioned architecture.
The important modules in this design are described in the
following sections.

2.1.1.  Voxel  Counter 

Calculation of MI requires processing each voxel in the RI:
fetching the voxel from the image memory, performing the
coordinate transformation, and updating the mutual histogram (MH).
In addition, because the implemented algorithm
processes the images on a subvolume basis, RI voxels within a
3D neighborhood corresponding to an individual subvolume
must be processed sequentially. The host programs the
FPGA-based MI calculator with subvolume start and end
addresses, and the voxel counter computes the address
corresponding to each voxel within that subvolume in z-y-x
order.
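The address generation performed by the voxel counter can be sketched in software as follows. This is only an illustrative Python model: the text does not specify which index varies fastest or how the image is laid out in memory, so the choice of z as the outermost loop, x as the fastest index, and a linear address of the form (z * ny + y) * nx + x are all assumptions, as is the function name `voxel_addresses`.

```python
def voxel_addresses(start, end, dims):
    """Yield linear addresses of all voxels in a subvolume.

    start, end: inclusive (x, y, z) corners of the subvolume.
    dims: (nx, ny, nz) dimensions of the full image.
    Assumed ordering: z outermost, then y, then x (fastest).
    """
    (x0, y0, z0), (x1, y1, z1) = start, end
    nx, ny, _ = dims
    for z in range(z0, z1 + 1):
        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                yield (z * ny + y) * nx + x
```

For a 2 × 2 × 2 subvolume at the origin of a 4 × 4 × 4 image, this visits addresses 0, 1, 4, 5, 16, 17, 20, 21.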

2.1.2.  Coordinate  Transformation 

The initial step in MI calculation involves applying a
candidate transformation (T) to each voxel coordinate ($v_r$)
in the RI to find the corresponding voxel coordinates in the
FI ($v_f$). This is mathematically expressed as

$$v_f = T \cdot v_r. \quad (4)$$

The deformation model employed is a six-parameter rigid
transformation model and is represented using a 4 × 4
matrix. The host calculates this matrix based on the current
candidate transformation provided by the optimization
routine and sends it to the MI calculator. A fixed-point
representation is used to store the individual elements of this
matrix. The coordinate transformation is accomplished by a
simple matrix multiplication.
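The matrix form of the six-parameter rigid transformation can be sketched as below. This is a floating-point Python illustration rather than the fixed-point hardware implementation; the rotation composition order (Rz·Ry·Rx), the use of radians, and the names `rigid_matrix` and `transform` are assumptions, since the text does not state them.

```python
import math

def rigid_matrix(tx, ty, tz, rx, ry, rz):
    """4x4 homogeneous matrix for a six-parameter rigid transformation.

    Rotation angles are in radians; the order Rz * Ry * Rx is an
    assumption made for this sketch.
    """
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx, tx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx, ty],
        [-sy, cy * sx, cy * cx, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform(T, v_r):
    """Apply T to an RI voxel coordinate (x, y, z) and return the FI coordinate."""
    h = (v_r[0], v_r[1], v_r[2], 1.0)  # homogeneous coordinate
    return tuple(sum(T[r][c] * h[c] for c in range(4)) for r in range(3))
```

For example, a 90-degree rotation about z combined with a translation of 10 along x maps (1, 0, 0) to approximately (10, 1, 0).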

2.1.3.  Partial  Volume  Interpolation 

The coordinates mapped in the FI space ($v_f$) do not normally
coincide with a grid point (integer location), thus requiring
interpolation. Nearest neighbor and trilinear interpolation
schemes have been used most often for this purpose; however,
partial volume (PV) interpolation, introduced by Maes
et al. [24], has been shown to provide smooth changes in
the histogram values with small changes in transformation.
The reported architecture consequently implements PV
interpolation as the choice of interpolation scheme. $v_f$, in
general, will have both fractional and integer components
and will land within an FI neighborhood of size 2 × 2 × 2.
The interpolation weights required for the PV interpolation
are calculated using the fractional components of $v_f$.
Fixed-point arithmetic is used to compute these interpolation
weights. The corresponding floating voxel intensities are
fetched by the image controller in parallel using the integer
components of $v_f$. The image controller also fetches the voxel
intensity corresponding to $v_r$. The MH then must be updated
for each pair of reference and floating voxel intensities (eight
in all) using the corresponding weights computed by the PV
interpolator.

Figure 1: Top-level block diagram of FPGA-based architecture for MI calculation.
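The PV interpolation step described above can be sketched as follows: the eight weights are products of the fractional components (or their complements), and each weight is added to the MH bin addressed by the reference intensity and one of the eight floating intensities. This is a floating-point Python model for illustration only; the names `pv_weights` and `update_mh` and the `fi_image[z][y][x]` indexing are assumptions.

```python
from itertools import product
import math

def pv_weights(v_f):
    """Return the integer base corner and the eight PV weights for v_f."""
    base = tuple(math.floor(c) for c in v_f)
    frac = tuple(c - b for c, b in zip(v_f, base))
    weights = {}
    for offset in product((0, 1), repeat=3):  # (dx, dy, dz) corner offsets
        w = 1.0
        for f, d in zip(frac, offset):
            w *= f if d else (1.0 - f)
        weights[offset] = w
    return base, weights

def update_mh(mh, ref_intensity, fi_image, v_f):
    """Distribute one RI voxel over eight MH bins using PV weights."""
    base, weights = pv_weights(v_f)
    for (dx, dy, dz), w in weights.items():
        x, y, z = base[0] + dx, base[1] + dy, base[2] + dz
        fi_intensity = fi_image[z][y][x]  # assumed [z][y][x] layout
        mh[ref_intensity][fi_intensity] += w
```

The eight weights always sum to one, so each RI voxel contributes a total count of one to the MH regardless of where $v_f$ lands.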


2.1.4.  Image  Memory  Access 

Images from different modalities (computed tomography
(CT), magnetic resonance imaging (MRI), positron emission
tomography (PET), etc.) have different native resolutions,
typical image dimensions, and dynamic ranges. Despite these
variations, dimensions of most medical images are smaller
than 512 × 512 × 512 and are supported by our architecture.
The dynamic range of these images is indicated by the
number of bits used to represent the intensity (gray value) at
every voxel. For MI-based registration, however, these images
are typically converted to 7- or 8-bit representation as a part
of image preprocessing. This is done to prevent dispersion of
the MH and leads to improved quality of image registration.
After this preprocessing step, all the gray values in the images
are used for image registration.

The  typical  size  of  3D  medical  images  prevents  the  use  of 
high-speed  memory  internal  to  the  FPGA  for  their  storage. 
Between  the  two  images,  the  RI  has  more  relaxed  access 
requirements  because  it  is  accessed  in  a  sequential  manner 
(in  z-y-x  order).  This  kind  of  access  benefits  from  burst 
accesses  and  memory  caching  techniques,  allowing  the  use 
of  modern  dynamic  random  access  memories  (DRAMs)  for 
image  storage.  For  the  architecture  presented,  both  the  RI 
and  FI  are  stored  in  separate  logical  partitions  of  the  same 
DRAM  module.  Because  the  access  to  the  RI  is  sequential  and 
predictable,  the  architecture  uses  internal  memory  to  cache  a 
block of RI voxels. Thus, during the processing of that block
of RI voxels, the image controller has parallel access to both
RI  and  FI  voxels.  The  RI  voxels  are  fetched  from  the  internal 
FPGA  memory,  whereas  the  FI  voxels  are  fetched  directly 
from  the  external  memory. 

The FI, however, must be accessed randomly (depending
on the current transformation T), and eight FI voxels (a
2 × 2 × 2 neighborhood) must be fetched for every RI
voxel to be processed. To meet this memory access
requirement, the reported architecture employs a memory
addressing scheme similar to the cubic addressing technique
reported in the context of volume rendering [4, 27] and
image registration [4, 27]. A salient feature of this technique
is that it allows simultaneous access to the entire 2 × 2 × 2
voxel neighborhood. The reported architecture implements
this technique by storing four copies of the FI and taking
advantage of the burst mode accesses native to modern
DRAMs. The image voxels are arranged sequentially such
that performing a size-two burst fetches two adjacent 2 × 2
neighborhood planes, thus making the entire neighborhood
available simultaneously. The image intensities of this
neighborhood are then further used for updating the MH.


2.1.5.  Updating  the  Mutual  Histogram 


For a given RI voxel (RV) and associated 3D neighborhood
in the FI ($FV_0 : FV_7$), there are eight intensity pairs
($RV, FV_0 : FV_7$) and corresponding interpolation weights.
Because the MH must be updated (read-modify-write)
at these eight locations, this amounts to 16 accesses (eight
reads and eight writes) to MH memory for each RI voxel.
This high memory access
requirement is handled by using the high-speed, dual-ported
memories internal to the FPGA to store the MH. The
operation of updating the MH is pipelined and, hence,
read-after-write (RAW) hazards can arise if consecutive
transactions attempt to update identical locations within the
MH. The reported design addresses this issue by introducing
preaccumulate buffers, which aggregate the weights from all
conflicting transactions. Thus, all the transactions leading to
a RAW hazard are converted into a single update to the MH,
thereby eliminating any RAW hazards.
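The effect of the preaccumulate buffers can be illustrated with a small behavioral sketch. The text does not describe the buffer depth or the exact conflict window, so the model below makes a simplifying assumption: only runs of back-to-back updates to the same MH address are merged. The function name `preaccumulate` and the (address, weight) tuple format are likewise hypothetical.

```python
def preaccumulate(updates):
    """Merge runs of consecutive updates that target the same MH address.

    updates: iterable of (address, weight) pairs in issue order.
    Returns a list in which each run of identical addresses has been
    replaced by a single update carrying the summed weight.
    """
    merged = []
    for address, weight in updates:
        if merged and merged[-1][0] == address:
            # Conflicting transaction: aggregate instead of issuing
            # a second read-modify-write to the same location.
            merged[-1] = (address, merged[-1][1] + weight)
        else:
            merged.append((address, weight))
    return merged
```

The total weight delivered to each address is unchanged; only the number of read-modify-write transactions is reduced.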

While the MH is being computed, the individual histogram
accumulator unit computes the histograms for the
RI and FI. These individual histograms are also stored
using internal, dual-ported memories. The valid voxel
counter module keeps track of the number of valid voxels
accumulated in the MH and calculates its reciprocal value.
The reciprocal value of the number of valid voxels in
the histogram is calculated by using successive subtraction
operations. This operation takes N clock cycles (where N
is the fractional wordlength of the reciprocal value) and
must be performed only once per MI calculation. The
resulting value is then used by the entropy calculation unit
for calculating the individual and joint probabilities and
subsequently entropies as described in (3).
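One way to obtain a reciprocal with N fractional bits in N steps is bit-serial (restoring) division, in which each step performs one compare-and-subtract and produces one quotient bit. The sketch below models that idea in Python; whether the reported design uses exactly this scheme is an assumption, and the name `reciprocal_fixed` is hypothetical.

```python
def reciprocal_fixed(count, frac_bits):
    """Return floor(2**frac_bits / count) using one subtraction per bit.

    count must be at least 2 so that 1/count is a pure fraction that
    fits in frac_bits fractional bits.
    """
    if count < 2:
        raise ValueError("count must be >= 2")
    remainder = 1      # numerator of 1/count
    quotient = 0
    for _ in range(frac_bits):  # one loop iteration models one clock cycle
        remainder <<= 1
        quotient <<= 1
        if remainder >= count:
            remainder -= count
            quotient |= 1
    return quotient
```

With 8 fractional bits, a count of 4 yields 64 (0.25 × 256) and a count of 3 yields 85, that is, 1/3 truncated to 8 fractional bits.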

2.1.6.  Entropy  Calculation 

The final step in MI calculation is to compute joint and individual
entropies using the joint and individual probabilities,
respectively. To calculate entropy, it is necessary to evaluate
the function $f(p) = p \cdot \ln(p)$ for all the probabilities. As
each probability $p$ takes on values within [0, 1], the
corresponding range for the function $f(p)$ is $[-e^{-1}, 0]$. Thus,
$f(p)$ has a finite dynamic range and is defined for all values
of $p$ (taking $f(0) = 0$, its limiting value). Several methods for
calculating logarithmic functions in
hardware have been reported [28], but of particular interest is
the multiple lookup table (LUT)-based approach introduced
by Castro-Pareja and Shekhar [5]. This approach minimizes
the error in representing $f(p)$ for a given number and size
of LUTs and, hence, is accurate and efficient. Following
this approach, the reported design implements $f(p)$ using
multiple LUT-based piecewise polynomial approximation.
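To make the idea of a table-based approximation of $f(p)$ concrete, the sketch below uses a single uniformly spaced table with linear (first-order polynomial) interpolation. This is a deliberate simplification and not the multiple-LUT scheme of [5], whose table partitioning and polynomial order are not given in this text; the names `f_exact`, `build_lut`, and `f_lut` are hypothetical.

```python
import math

def f_exact(p):
    """Reference value of f(p) = p * ln(p), with f(0) taken as 0."""
    return p * math.log(p) if p > 0 else 0.0

def build_lut(num_segments):
    """Sample f at the endpoints of uniform segments covering [0, 1]."""
    return [f_exact(i / num_segments) for i in range(num_segments + 1)]

def f_lut(p, lut):
    """Approximate f(p) by linear interpolation between table entries."""
    n = len(lut) - 1
    pos = p * n
    i = min(int(pos), n - 1)   # segment index; clamp so p == 1 is valid
    t = pos - i                # position within the segment
    return lut[i] + t * (lut[i + 1] - lut[i])
```

With a uniform table the largest error occurs in the first segment, where the slope of $f(p)$ is unbounded as $p$ approaches zero; this is one motivation for using several tables with different resolutions, as in the approach cited above.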

2.2.  Parameterized  Architectural  Design 

Implementations of signal processing algorithms using
microprocessor- or DSP-based approaches are characterized
by a fixed datapath width. This width is determined by the
hardwired datapath of the underlying processor architecture.
Reconfigurable implementation based on FPGAs, in
contrast, allows the size of datapath to be customized to
achieve better tradeoffs among accuracy, area, and power.
Moreover, this customization can also change at run time
to accommodate varying design requirements. The use of
such custom data representation for optimizing designs is
one of the main strengths of reconfigurable computing [29].
It has been contended that the most efficient hardware
implementation of an algorithm is the one that supports a
variety of finite precision representations of different sizes
for its internal variables [8]. In this spirit, many commercial
and research efforts have employed a parameterized design
style for intellectual property (IP) cores [30-34]. This
parameterization capability not only facilitates reuse of
design cores but also allows them to be reconfigured to meet
design requirements.

During the design of the aforementioned architecture,
we adopted a similar design style that allows configuration
of the wordlengths of the internal variables. Hardware
design languages such as VHDL and Verilog natively support
hierarchical parameterization of a design through use of
generics and parameters, respectively. This design style takes
advantage of these language features and is employed for the
design of all the modules described earlier. We highlight the
main features of this design style using illustrative examples.
Consider a design module that computes an output variable
through arithmetic manipulation of two input variables.
The wordlengths of the input
variables (denoted by IP1_WIDTH, IP2_WIDTH) and that
of the output variable (denoted by OP_WIDTH) are the
design parameters for this module. The module can then
be parameterized for these design variables as illustrated in
Figure 2(a).

In a pipelined implementation of an operation, a module
may have multiple internal pipeline stages and corresponding
intermediate variables. Wordlengths chosen for these
intermediate variables can also impact the accuracy and
hardware requirements of a design. In our implementation
scheme, we do not employ any rounding or truncation for
the intermediate variables, but deduce their wordlengths
based on the wordlengths of the input operands and the
arithmetic operation to be implemented. For example,
multiplication of two 8-bit variables will, at the most, require
a 16-bit-wide intermediate output variable. A parameterized
implementation of this scenario is illustrated in Figure 2(c).
Sometimes  it  is  also  necessary  to  instantiate  a  vendor- 
provided  or  a  third-party  IP  core,  such  as  a  first  in/first 
out  (FIFO)  module  or  an  arithmetic  unit,  within  a  design 
module.  In  such  cases,  we  simply  pass  the  wordlength 
parameters  down  the  design  hierarchy  to  configure  the  IP 
core  appropriately  and  thereby  maintain  the  parameterized 
design  style  (see,  e.g.,  Figure  2(b)). 

When  signals  cross  module  boundaries,  the  output 
wordlength  and  format  (position  of  the  binary  point)  of 
the  source  module  should  match  the  input  wordlength  and 
format  of  the  destination  module.  This  is  usually  achieved 
through use of a rounding strategy and right- or left-shifting
of the signals. Adopting a “round to nearest”
strategy to achieve wordlength matching is expected
to introduce the smallest error but requires additional
logic resources. In our design, we therefore implement
truncation (or a “rounding toward zero” strategy), while the
signal shifting is achieved through zero padding. Both these
operations  are  parameterized  and  take  into  account  the 
wordlengths  and  the  format  at  the  module  boundaries 
(see,  e.g.,  Figure  2(c)).  Thus,  this  parameterized  design  style 
enables  the  architecture  to  support  multiple  wordlength 
configurations  for  its  internal  variables. 
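The wordlength-matching rule described above (full-precision intermediate result, then truncation of least significant bits or zero padding at the module boundary) can be modeled in software as follows. This is a behavioral Python sketch for unsigned operands only, which is an assumption; the function name `multiply_and_match` is hypothetical, while the width parameters mirror the names used in the text.

```python
def multiply_and_match(s1, s2, ip1_width, ip2_width, op_width):
    """Multiply two unsigned inputs and match the result to op_width bits.

    The intermediate product keeps ip1_width + ip2_width bits, so no
    precision is lost before the module boundary.
    """
    full_width = ip1_width + ip2_width
    i1 = s1 * s2  # full-precision intermediate variable
    if full_width >= op_width:
        # Truncation: keep the op_width most significant bits.
        return i1 >> (full_width - op_width)
    # Signal shifting: append zero LSBs to reach op_width bits.
    return i1 << (op_width - full_width)
```

For example, multiplying two 8-bit values of 255 gives the 16-bit product 65025; truncating to an 8-bit output keeps the upper byte, 254. Sweeping `op_width` in such a model is one simple way to observe how output wordlength trades accuracy against datapath size.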
--Declaration of a parameterized entity
entity Module1 is
    generic (
        IP1_WIDTH : INTEGER;
        IP2_WIDTH : INTEGER;
        OP_WIDTH  : INTEGER);
    port (
        s1 : IN  STD_LOGIC_VECTOR (IP1_WIDTH-1 DOWNTO 0);
        s2 : IN  STD_LOGIC_VECTOR (IP2_WIDTH-1 DOWNTO 0);
        o1 : OUT STD_LOGIC_VECTOR (OP_WIDTH-1 DOWNTO 0));
end Module1;

(a)

--Instantiation of a vendor supplied
--IP-core, scfifo. The width of the
--FIFO is set to be equal to that of
--the signal to be buffered (o1).
fifo1 : scfifo
    generic map (
        LPM_NUMWORDS => (2**LOG2_DEPTH),
        LPM_WIDTH    => OP_WIDTH,
        LPM_WIDTHU   => LOG2_DEPTH)
    port map (
        data => o1, ...);

(b)

--Declaration of an intermediate variable with appropriate wordlength
signal i1 : STD_LOGIC_VECTOR (IP1_WIDTH+IP2_WIDTH-1 DOWNTO 0);
i1 <= s1 * s2; --Arithmetic operation (multiplication)

--Truncation of LSBs: Performed if (IP1_WIDTH+IP2_WIDTH) >= OP_WIDTH
o1 <= i1(IP1_WIDTH+IP2_WIDTH-1 DOWNTO IP1_WIDTH+IP2_WIDTH-OP_WIDTH);

--Signal-shifting: Performed if (IP1_WIDTH+IP2_WIDTH) < OP_WIDTH
o1 <= i1 & CONV_STD_LOGIC_VECTOR(0, OP_WIDTH-(IP1_WIDTH+IP2_WIDTH));

(c)

Figure 2: Parameterized architectural design: (a) declaration of a parameterized entity; (b) an example instantiation of a vendor-supplied
IP-core; (c) usage of parameterized internal variables and an example of truncation and signal shifting, performed at the module boundaries.


3.  Multiobjective  Optimization 

The aforementioned architecture is designed to accelerate the
calculation of MI for performing medical image registration.
We have demonstrated this architecture to be capable of
offering execution performance superior to that of a software
implementation [6]. The accuracy of MI calculation (and,
by extension, that of image registration) offered by this
implementation, however, is a function of the wordlengths
chosen for the internal variables of the design. Similarly,
these wordlengths also control the hardware implementation
cost of the design. For medical applications, the ability of
an implementation to achieve the desired level of accuracy
is of paramount importance. It is, therefore, necessary to
understand the tradeoff between accuracy and hardware
implementation cost for a design and to identify wordlength
configurations that provide effective tradeoffs between these
conflicting criteria. This multiobjective optimization allows
a designer to systematically maximize accuracy for a given
hardware cost limitation (e.g., imposed by a target device)
or minimize hardware resources to meet the accuracy
requirements of a medical application.

Section 3.1 provides a formal definition of this problem, and
Section 3.2 describes a framework developed for
multiobjective optimization of FPGA-based medical image
registration.
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3.1.  Problem  Statement 

Consider a system $Q$ that is parameterized by $N$ parameters
$n_i$ ($i = 1, 2, \ldots, N$), where each parameter can take on a
single value from a corresponding set of valid values ($v_i$). Let
the design configuration space corresponding to this system
be $S$, which is defined by a set consisting of all $N$-tuples
generated by the Cartesian product of the sets $v_i$, $\forall i$:

$$S = v_1 \times v_2 \times v_3 \times \cdots \times v_N. \tag{5}$$

The size of this design configuration space is then equal to
the cardinality of the set $S$ or, in other words, the product of
the cardinalities of the sets $v_i$:

$$|S| = |v_1| \times |v_2| \times |v_3| \times \cdots \times |v_N|. \tag{6}$$

For most systems, not all configurations that belong to $S$ may
be valid or practical. We, therefore, define a subset
$\mathcal{I}$ ($\mathcal{I} \subseteq S$),
such that it contains all the feasible system configurations.
Now, consider $m$ objective functions ($f_1, f_2, \ldots, f_m$) defined
for system $Q$, such that each function associates a real value
for every feasible configuration $c \in \mathcal{I}$.

The problem of multiobjective optimization is then to
find a set of solutions that simultaneously optimizes the $m$
objective functions according to an appropriate criterion.
The most commonly adopted notion of optimality in
multiobjective optimization is that of Pareto optimality.
According to this notion, a solution $c^*$ is Pareto optimal if
there does not exist another solution $c \in \mathcal{I}$ such that
$f_i(c) \le f_i(c^*)$, for all $i$, and $f_j(c) < f_j(c^*)$, for at least one $j$. The
solution $c^*$ is also called a nondominated solution because
no other solution dominates (or is superior to) solution $c^*$ as
per the Pareto-optimality criteria. The set of Pareto-optimal
solutions, therefore, includes all nondominated solutions.
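
As a small illustration of the definitions above, the following Python sketch enumerates a toy design space as a Cartesian product and filters it down to its nondominated configurations. The value sets, the two objective functions, and all function names are invented for the example; they are assumptions and do not come from the architecture described in this paper.

```python
from itertools import product

# Hypothetical valid-value sets v_1 and v_2 (wordlengths in bits).
valid_values = [[2, 4, 8], [4, 8, 16]]

# The design space S is the Cartesian product of the sets v_i.
design_space = list(product(*valid_values))


def objectives(config):
    """Made-up objectives, both minimized: (error proxy, cost proxy)."""
    error = sum(2.0 ** -w for w in config)
    cost = sum(config)
    return (error, cost)


def dominates(a, b):
    """True if objective vector a is no worse than b everywhere and better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(configs):
    """Return the configurations that no other configuration dominates."""
    scored = {c: objectives(c) for c in configs}
    return [c for c in configs
            if not any(dominates(scored[o], scored[c]) for o in configs)]


front = pareto_front(design_space)
```

In this toy case the space has 3 x 3 = 9 configurations, of which six are nondominated. Two of them, (4, 8) and (8, 4), have identical objective values, so neither dominates the other and both are kept.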

Given a multiobjective optimization problem and a
heuristic technique for this problem that attempts to derive
Pareto-optimal or near Pareto-optimal solutions, we refer
to solutions derived by the heuristic as “Pareto-optimized”
solutions.

3.2.  Multiobjective  Optimization  Framework 

Figure 4 illustrates the framework that we have developed
for multiobjective optimization of the aforementioned
architecture. The framework has two basic components. The
first  is  the  search  algorithm  that  explores  the  design  space 
and  generates  feasible  candidate  solutions;  the  second  is 
the  objective  function  evaluation  module  that  evaluates 
candidate  solutions.  The  solutions  and  associated  objective 
values  are  fed  back  to  the  search  algorithm  so  that  they 
can  be  used  to  refine  the  search.  These  two  components 
are  loosely  coupled  so  that  different  search  algorithms  can 
be  easily  incorporated  into  the  framework.  Moreover,  the 
objective  function  evaluation  module  is  parallelized  using  a 
message  passing  interface  (MPI)  on  a  32-processor  cluster. 
With  this  parallel  implementation,  multiple  solutions  can  be 
evaluated  in  parallel,  thereby  increasing  search  performance. 
These  components  are  described  in  detail  in  the  following 
sections. 
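
The loose coupling between the search algorithm and the evaluation module can be sketched as follows. This is only an illustrative outline: it uses a Python thread pool in place of the MPI-based cluster implementation described above, and the functions `evaluate`, `propose`, and `optimize` are hypothetical stand-ins, not the actual framework code.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def evaluate(config):
    """Stand-in objective evaluation returning (error, cost); purely illustrative."""
    total_bits = sum(config)
    return (1.0 / total_bits, total_bits)


def propose(rng, history, count, num_variables=4):
    """Stand-in search step; a real search algorithm would exploit `history`."""
    return [tuple(rng.randint(1, 32) for _ in range(num_variables))
            for _ in range(count)]


def optimize(iterations=3, batch=8, workers=4, seed=0):
    """Alternate between proposing candidates and evaluating them concurrently."""
    rng = random.Random(seed)
    history = []  # (configuration, objectives) pairs fed back to the search
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(iterations):
            candidates = propose(rng, history, batch)
            results = list(pool.map(evaluate, candidates))
            history.extend(zip(candidates, results))
    return history
```

Because the search step only sees the list of evaluated configurations, a different search algorithm can be substituted by replacing `propose` without touching the evaluation code.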


3.2.1.  Design  Parameters 

As described in Section 3.2, the architecture performs
MI calculation using a fixed-point datapath. As a result,
the accuracy of MI calculation depends on the precision
(wordlength) offered by this datapath. The design parameters
in this datapath define the design space and are identified
and listed along with the corresponding design module (see
Figure 1) in Table 1.

A  fixed-point  representation  consists  of  an  integer  part 
and  a  fractional  part.  The  numbers  of  bits  assigned  to 
these  two  parts  are  called  the  integer  wordlength  (IWL)  and 
fractional  wordlength  (FWL),  respectively.  The  individual 
numbers  of  bits  allocated  to  these  parts  control  the  range 
and  precision  of  the  fixed-point  representation.  For  this 
architecture,  the  IWL  required  for  each  design  parameter 
can  be  deduced  from  the  range  information  specific  to  the 
image  registration  application.  For  example,  in  order  to 
support  translations  in  the  range  of  [-64, 63]  voxels,  7  bits  of 
IWL  (with  1  bit  assigned  as  a  sign  bit)  are  required  for  the 
translation  parameter.  We  used  similar  range  information 
to  choose  the  IWL  for  all  the  parameters,  and  these  values 
are reported in Table 1. The precision required for each
parameter, which is determined by its FWL, is not known
a priori. We, therefore, determine this by performing
multiobjective optimization using the FWL of each parameter as
a design variable. In our experiments, we used the design
range of [1, 32] bits for FWLs of all the parameters. The
optimization framework can support different wordlength
ranges for different parameters, which can be used to account
for additional design constraints, such as
certain kinds of constraints imposed by third-party IP.
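
The roles of IWL and FWL can be illustrated with a short Python sketch. It assumes a signed two's-complement representation and round-to-nearest quantization; the hardware may instead truncate, and the helper names are hypothetical.

```python
import math


def integer_wordlength(lo, hi):
    """Bits (including the sign bit) needed for signed integers in [lo, hi]."""
    # Two's complement with n bits covers [-2^(n-1), 2^(n-1) - 1].
    magnitude = max(abs(lo), hi + 1)
    return math.ceil(math.log2(magnitude)) + 1


def quantize(value, fwl):
    """Round value to the nearest multiple of 2^-fwl (overflow is not modeled)."""
    scale = 1 << fwl
    return round(value * scale) / scale


translation_iwl = integer_wordlength(-64, 63)
coarse = quantize(0.3, 4)    # 4 fractional bits
fine = quantize(0.3, 16)     # 16 fractional bits
```

Here `translation_iwl` evaluates to 7, matching the translation example above, and the quantization error of `fine` is far smaller than that of `coarse`, which is the precision effect that the FWL design variables control.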

The entropy calculation module is implemented using
a multiple LUT-based approach and also employs fixed-point
arithmetic. However, this module has already been
optimized  for  accuracy  and  hardware  resources,  as  described 
previously  [5].  The  optimization  strategy  employed  in  this 
earlier  work  uses  an  analytical  approach  that  is  specific 
to  entropy  calculation  and  is  distinct  from  the  strategy 
presented  in  this  work.  This  module,  therefore,  does  not 
participate  in  the  multiobjective  optimization  framework  of 
this  paper,  and  we  simply  use  the  optimized  configuration 
identified  earlier.  This  further  demonstrates  the  flexibility 
of  our  optimization  framework  to  accommodate  arbitrary 
designer-  or  externally  optimized  modules. 


3.2.2.  Search  Algorithms 

An exhaustive search that explores the entire design space
is guaranteed to find all Pareto-optimal solutions. However,
this search can lead to unreasonable execution time, especially
when the objective function evaluation is computationally
intensive. For example, with four design variables, each
taking one of 32 possible values, the design space consists
of $32^4$ solutions. If the objective function evaluation takes 1
minute per trial (which is quite realistic for multiple MI
calculations using large images), the exhaustive search will take
2 years. Even with the 32-processor cluster that we employed
and assuming linear speedup, exhaustive search for a four-variable
system will require about 3.5 weeks. This highlights
the infeasibility of exhaustive search even for a system with
a relatively small number of design variables. Consequently,
we have considered alternative search methods, as described
below.

Table 1: Design variables for FPGA-based architecture. Integer wordlengths are determined based on application-specific range information,
and fractional wordlengths are used as parameters in the multiobjective optimization framework.

| Architectural module | Design variable | Integer wordlength (IWL) (bits) | Fractional wordlength (FWL) range (bits) |
|---|---|---|---|
| Voxel coordinate transformation | Translation vector | 7 | [1, 32] |
| Voxel coordinate transformation | Rotation matrix | 4 | [1, 32] |
| Partial volume interpolation | Floating image address | 9 | [1, 32] |
| Mutual histogram accumulation | Mutual histogram bin | 25 | [1, 32] |

Table 2: Number of solutions explored by search methods.

| Search method | Number of solutions explored |
|---|---|
| Partial search | 65 536 |
| Random search | 6000 |
| EA-based search | 6000 |

The first method is partial search, which explores only
a portion of the entire design space. For every design
variable, the number of possible values it can take is reduced
by half by choosing every alternate value. A complete
search is then performed in this reduced search space. This
method, although not exhaustive, can effectively sample the
breadth of the design space. The second method is random
search, which involves randomly generating a fixed number
of feasible solutions. For both of these methods,
Pareto-optimized solutions are identified from the set of solutions
explored.
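
The two sampling strategies just described can be sketched in Python as follows. The sketch assumes four design variables with FWLs in [1, 32], as in Table 1; which alternate values are retained (here, the odd ones) and the use of sampling with replacement are assumptions, and the function names are hypothetical.

```python
import random
from itertools import product

FWL_VALUES = list(range(1, 33))   # design range [1, 32] bits
NUM_VARIABLES = 4                 # four design variables


def partial_search_space(values, num_variables):
    """Keep every alternate value per variable, then enumerate exhaustively."""
    reduced = values[::2]
    return product(reduced, repeat=num_variables)


def random_search_space(values, num_variables, count, seed=0):
    """Draw `count` configurations uniformly at random (with replacement)."""
    rng = random.Random(seed)
    return [tuple(rng.choice(values) for _ in range(num_variables))
            for _ in range(count)]


partial_count = sum(1 for _ in partial_search_space(FWL_VALUES, NUM_VARIABLES))
random_configs = random_search_space(FWL_VALUES, NUM_VARIABLES, 6000)
```

Halving each of the four value sets leaves 16 values per variable, so the partial search enumerates 16^4 = 65 536 configurations, which agrees with the count reported in Table 2.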

The third method is performing a search using evolutionary
techniques. Evolutionary algorithms (EAs) have been shown to be effective
in efficiently exploring large search spaces [18, 19]. In
particular, we have employed SPEA2 [20], which is quite
effective in sampling from along an entire Pareto-optimal
front and distributing the solutions generated relatively
evenly over the optimal tradeoff surface. Moreover, SPEA2
incorporates a fine-grained fitness assignment strategy and
an enhanced archive truncation method, which further assist
in finding Pareto-optimal solutions. The flow of operations
in this search algorithm is shown in Figure 4.

For the EA-based search algorithm, the representation of
the system configuration is mapped onto a “chromosome”
whose “genes” define the wordlength parameters of the
system. Each gene, corresponding to the wordlength of a
design variable $i$, is represented using an integer allele that
can take values from the set $v_i$, described earlier. Thus, every
gene is confined to wordlength values that are predefined and
feasible for a given design variable. The genetic operators
for cross-over and mutation are also designed to adhere to
this constraint and always produce values from set $v_i$, for
a gene $i$ within a chromosome. This representation scheme
is both symmetric and repair-free and, hence, is favored by
the schema theory [35] and is computationally efficient, as
described by Kianzad and Bhattacharyya [36].
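
A minimal Python sketch of such a repair-free representation is given below. It assumes a list-of-integers chromosome, one-point cross-over, and a mutation that redraws a gene from its own valid set; the operator details and names are illustrative assumptions rather than the SPEA2 implementation used in this work.

```python
import random

# Valid wordlength set v_i for each gene (here [1, 32] for all four genes).
VALID = [list(range(1, 33)) for _ in range(4)]


def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two chromosomes at a random cut point."""
    cut = rng.randrange(1, len(parent_a))
    return (parent_a[:cut] + parent_b[cut:],
            parent_b[:cut] + parent_a[cut:])


def mutate(chromosome, rng, probability=0.06):
    """With the given probability, redraw each gene from its own set v_i."""
    return [rng.choice(VALID[i]) if rng.random() < probability else gene
            for i, gene in enumerate(chromosome)]


rng = random.Random(1)
parent_a = [rng.choice(v) for v in VALID]
parent_b = [rng.choice(v) for v in VALID]
child_a, child_b = one_point_crossover(parent_a, parent_b, rng)
mutant = mutate(child_a, rng)
```

Because each gene position only ever receives a value from the same position of a parent or from its own set, every offspring is a feasible configuration and no repair step is needed.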


3.2.3. Objective Function Models and their Fidelity

Search for Pareto-optimized configurations requires evaluating
candidate solutions and determining Pareto-dominance
relationships between them. This can be achieved by calculating
objective functions for all the candidate solutions and
by  relative  ordering  of  the  solutions  with  respect  to  the  values 
of  their  corresponding  objective  functions.  We  consider  the 
error  in  MI  calculation  and  the  hardware  implementation 
cost  to  be  the  conflicting  objectives  that  must  be  minimized 
for  our  FPGA  implementation  problem.  We  model  the  FPGA 
implementation  cost  using  two  components:  the  first  is  the 
amount  of  logic  resources  (number  of  LUTs)  required  by 
the  design  and  the  second  is  the  internal  memory  consumed 
by  the  design.  We  treat  these  as  independent  objectives 
in  order  to  explore  the  synergistic  effects  between  these 
complementary  resources.  Because  of  the  size  of  the  design 
space  and  limitations  resulting  from  execution  time,  it  is 
not  practical  to  synthesize  and  evaluate  each  solution.  We, 
therefore,  employ  models  for  calculating  objective  functions 
to evaluate the solutions. The quality of the Pareto-optimized
solutions  will  then  depend  on  the  fidelity  of  these  objective 
function  models. 

The error in MI calculation can be computed by
comparing the MI value reported by the limited-precision
FPGA implementation against that calculated by a
double-precision software implementation. For this purpose, we
have utilized a bit-true emulator of the hardware. This
emulator was developed in C++ and uses fixed-point arithmetic
to  accurately  represent  the  behavior  of  the  limited-precision 
hardware.  It  supports  multiple  wordlengths  for  internal 
variables  and  is  capable  of  accurately  calculating  the  MI 
value  corresponding  to  any  feasible  configuration.  We  have 
verified  its  equivalence  with  the  hardware  implementation 
for  a  range  of  configurations  and  image  transformations. 
This  emulator  was  used  to  compute  the  MI  calculation  error. 
The  MI  calculation  error  was  averaged  for  three  distinct 
image  pairs  (with  different  image  modality  combinations) 
and  for  50  randomly  generated  image  transformations.  The 
same  sets  of  image  pairs  and  image  transformations  were 
used  for  evaluating  all  feasible  configurations. 

The memory required for a configuration is primarily
needed for intermediate FIFOs, which are used to buffer
internal variables, and the MH memory. For example, a
64-word-deep FIFO used to buffer a signal with a wordlength
of $b$ will require $64 \times b$ bits of memory. In our architecture,
the depth of the FIFOs and the dimensions of the
MH are constant, whereas their corresponding widths are
determined by the wordlength of the design parameters.
Using these insights, we have developed an
architecture-specific analytical expression that accurately represents the
cumulative amount of memory required for all internal
FIFOs and MH. We used this expression to calculate the
memory requirement of a configuration.
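
The general shape of such a memory model can be sketched in Python as shown below. The actual architecture-specific expression is not reproduced in this section, so the FIFO list, the MH size, and the wordlengths in the example are hypothetical placeholders.

```python
def fifo_bits(depth, wordlength):
    """Memory for one FIFO: `depth` words of `wordlength` bits each."""
    return depth * wordlength


def total_memory_bits(fifos, mh_bins, mh_wordlength):
    """Sum the FIFO memory and the mutual histogram (MH) memory.

    fifos: list of (depth, wordlength) pairs.
    mh_bins: number of histogram bins (constant for a given architecture).
    mh_wordlength: IWL + FWL of one histogram bin, in bits.
    """
    return sum(fifo_bits(d, w) for d, w in fifos) + mh_bins * mh_wordlength


# Hypothetical example: two 64-deep FIFOs and a 64 x 64-bin histogram,
# each signal using its Table 1 IWL plus an assumed FWL of 8 bits.
example_fifos = [(64, 7 + 8), (64, 9 + 8)]
example_total = total_memory_bits(example_fifos, mh_bins=64 * 64,
                                  mh_wordlength=25 + 8)
```

Since depths and bin counts are fixed, the total is a linear function of the wordlengths, which makes the memory objective inexpensive to evaluate for every candidate configuration.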

For  estimating  the  area  requirements  of  a  configuration, 
we  adopt  the  area  models  reported  by  Constantinides  et  al. 
[11,  37].  These  are  high-level  models  of  common  functional 
units  such  as  adders,  multipliers,  and  delays.  These  models 
are  derived  from  the  knowledge  of  the  internal  architecture 
of  these  components.  Area  cost  for  interconnects  and  routing 
is  not  taken  into  account  in  this  analysis.  These  models 
have been verified for the Xilinx Virtex series of FPGAs and
are equally applicable to alternative FPGA families and for
application-specific integrated circuit (ASIC) implementations.
These models have also been previously used in the
context of wordlength optimization [11, 37, 38].

We further evaluated the fidelity [39] of these area
models using a representative module, the PV interpolator, from
the aforementioned architecture. This module receives the
fractional components of the FI address and computes the
corresponding interpolation weights. We varied the FWL
of the FI address from 1 to 32 bits and synthesized the
module using the Altera Stratix II and Xilinx Virtex 5 as
target devices. For a meaningful comparison, the settings
for the analysis, synthesis, and optimization algorithms (e.g.,
settings to favor area or speed) for the design tools (Altera
Quartus II and Xilinx ISE) were chosen to be comparable.
After complete synthesis, routing, and placement, we
recorded the area (number of LUTs) consumed by the
synthesized design. This process was automated by using
the Tcl scripting feature provided by the design tools and
through the parameterized design style described earlier. We
then compared the consumed area against that predicted by
the adopted area models for all FWL configurations. The
results of this experiment are presented in Figure 3. These
results indicate that the area estimates (number of LUTs)
predicted by the model are comparable to those obtained
through physical synthesis for both the target devices. For
quantitative evaluation, the fidelity of the area models was
calculated as follows:

$$\text{Fidelity} = \frac{2}{N(N-1)} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} F_{ij}, \tag{7}$$

where

$$F_{ij} = \begin{cases} 1, & \text{if } \operatorname{sign}(S_i - S_j) = \operatorname{sign}(M_i - M_j), \\ 0, & \text{otherwise.} \end{cases} \tag{8}$$

In these equations, $N$ is the number of configurations compared,
the $M_i$ represent the values predicted by
the area models, and the $S_i$ represent the values obtained after
physical synthesis. The fidelity of the area models when
evaluated with respect to the synthesis results obtained for
both Altera and Xilinx devices was 1, which corresponds to
maximum (“perfect”) fidelity.

Figure 3: Comparison of the area values predicted by the adopted
area models with those obtained after physical synthesis.
(Plotted series: area models, Altera Stratix II, Xilinx Virtex 5.)

Figure 4: Framework for multiobjective optimization of FPGA-based
image registration.
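
The pairwise comparison behind the fidelity measure can be sketched in Python as follows. Normalizing by the number of configuration pairs is an assumption chosen so that the maximum value is 1, and the area values in the example are made up for illustration.

```python
from itertools import combinations


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def fidelity(synthesized, modeled):
    """Fraction of configuration pairs whose relative ordering the model preserves."""
    pairs = list(combinations(range(len(synthesized)), 2))
    agree = sum(
        sign(synthesized[i] - synthesized[j]) == sign(modeled[i] - modeled[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Made-up LUT counts: the model underestimates but preserves the ordering.
synth = [100, 130, 170, 220]
model_consistent = [80, 100, 130, 165]
model_swapped = [80, 140, 130, 165]   # one pair is ordered differently

perfect = fidelity(synth, model_consistent)
imperfect = fidelity(synth, model_swapped)
```

The first model yields a fidelity of 1 even though every estimate is too low, whereas the second model misorders one of the six pairs and scores 5/6. This shows why fidelity, rather than absolute accuracy, is the property that matters for dominance comparisons.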

An interesting observation is that in some cases the
high-level area models underestimate the number of LUTs
required by as much as 25%. This can be explained by
the fact that these models were calibrated using
previous-generation devices [11, 37]. It must be noted, however,
that the Pareto-dominance relationship between the design
configurations is maintained as long as the relative ordering
(with respect to an objective function such as area)
between two design configurations is preserved. Using more
accurate area models will certainly improve the absolute
prediction of area requirements corresponding to a given
design configuration but, as such, will not affect the relative
ordering of a set of design configurations. Designing accurate
area models that take into account the latest devices,
cross-vendor FPGA architectures, special-purpose computational
units, and various synthesis optimizations is nevertheless
important and will be a topic of a future investigation.
The perfect fidelity we achieved for the current area models
indicates that the relative ordering of FWL configurations
with respect to their area requirements is consistent for the
model and synthesized designs. These results further validate
the applicability of using the aforementioned area models for
multiobjective optimization.

Table 3: Parameters used for EA-based search.

| Parameter | Value |
|---|---|
| Population size | 200 |
| Number of generations | 30 |
| Cross-over probability | 1.0 |
| Mutation probability | 0.06 |

4.  Results 

We performed multiobjective optimization of the aforementioned
architecture using the search algorithms outlined in
Section  3.  To  account  for  the  effects  of  random  number 
generation,  the  EA-based  search  and  random  search  were 
repeated  five  times  each,  and  the  average  behavior  from 
these  repeated  trials  is  reported.  The  number  of  solutions 
explored  by  each  search  algorithm  in  a  single  run  is  reported 
in  Table  2.  The  execution  time  of  each  search  algorithm  was 
roughly  proportional  to  the  number  of  solutions  explored, 
and  the  objective  function  evaluation  for  each  solution  took 
approximately  1  minute  using  a  single  computing  node.  As 
expected,  the  partial  search  algorithm  explored  the  largest 
number  of  solutions.  The  parameters  used  for  the  EA-based 
search  are  listed  in  Table  3.  These  parameters  were  identified 
experimentally.  For  example,  using  a  population  size  of  100 
yielded  similar  search  results;  however,  the  diversity  of  the 
solutions  found  in  the  objective  space  was  relatively  poor. 
Similarly, increasing the maximum number of generations beyond
30  did  not  yield  a  significant  improvement  in  the  quality  of 
the  search  solutions.  The  cross-over  and  mutation  operators 
were  chosen  to  be  one-point  cross-over  and  flip  mutator, 
respectively.  For  a  fair  comparison,  the  number  of  solutions 
explored  by  the  random  search  algorithm  was  set  to  be  equal 
to  that  explored  by  the  EA-based  algorithm. 

The solution sets obtained by each search method
were then further reduced to corresponding nondominated
solution sets using the concept of Pareto optimality. As
described earlier, the objectives considered for this evaluation
were the MI calculation error and the memory and area
requirements of the solutions. Figure 5 shows the
Pareto-optimized solution set obtained for each search method.
Qualitatively, the Pareto front identified by the EA-based
search is denser and more widely distributed and demonstrates
better diversity than those of the other search methods. Figure 6
compares the Pareto fronts obtained by partial search and
EA-based search by overlaying them and illustrates that
the EA-based search can identify better Pareto-optimized
solutions, which indicates the superior quality of solutions
obtained by this search method. Moreover, it must be noted
that the EA-based search ran more than 10 times faster
than the partial search.


4.1. Metrics for Comparison of Pareto-Optimized Solution Sets

Quantitative comparison of the Pareto-optimized solution
sets is essential in order to compare more precisely the
effectiveness of various search methods. As with most
real-world complex problems, the Pareto-optimal solution set
is unknown for this application. We, therefore, employ the
following two metrics to perform quantitative comparison
between different solution sets. We use the ratio of
nondominated individuals (RNI) to judge the quality of a given
solution set, and the diversity of a solution set is measured
using the cover rate. These performance measures are similar
to those reported by Zitzler and Thiele [40] and are described
below.

The RNI is a metric that measures how close a solution
set is to the Pareto-optimal solution set. Consider two
solution sets ($P_1$ and $P_2$) that each contain only nondominated
solutions. Let the union of these two sets be $P_U$. Furthermore,
let $P_{ND}$ be a set of all nondominated solutions in $P_U$
($P_{ND} \subseteq P_U$). The RNI for the solution set $P_i$ is then calculated as

$$\mathrm{RNI}_i = \frac{|P_i \cap P_{ND}|}{|P_{ND}|}, \tag{9}$$

where $|\cdot|$ is the cardinality of a set. The closer this ratio is to
100%, the better the solution set is and the closer it
is to the Pareto-optimal front. We computed this metric for
all the search algorithms previously described, and the results
are presented in Figure 7. Our EA-based search offers better
RNI and, hence, superior quality solutions to those achieved
with either the partial or random search.
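
The RNI computation can be sketched in Python for two solution sets as follows. The objective vectors are invented for illustration, both objectives are assumed to be minimized, and the function names are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def nondominated(points):
    """Return the points that no other point in the collection dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def rni(p_i, p_other):
    """Share of the combined nondominated set that is contributed by p_i."""
    union = set(p_i) | set(p_other)
    p_nd = set(nondominated(list(union)))
    return len(set(p_i) & p_nd) / len(p_nd)


# Made-up objective vectors for two nondominated solution sets.
p1 = [(1, 5), (3, 3), (5, 1)]
p2 = [(2, 4), (4, 4), (6, 0.5)]

rni_1 = rni(p1, p2)
rni_2 = rni(p2, p1)
```

In this example the point (4, 4) from the second set is dominated by (3, 3) from the first, so the combined nondominated set has five members, three from `p1` and two from `p2`, giving RNI values of 0.6 and 0.4.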

The cover rate estimates the spread and distribution (or
diversity) of a solution set in the objective space. Consider
the region between the minimum and maximum of an
objective function as being divided into an arbitrary number
of partitions. The cover rate is then calculated as the ratio
of the number of partitions that are covered (i.e., there exists
at least one solution with an objective value that falls within
a given partition) by a solution set to the total number
of partitions. The cover rate ($C_k$) of a solution set for an
objective function ($f_k$) can then be calculated as

$$C_k = \frac{N_k}{N}, \tag{10}$$

where $N_k$ is the number of covered partitions and $N$ is the
total number of partitions. If there are multiple objective
functions (e.g., $m$), then the net cover rate can be obtained
by averaging the cover rates for each objective function as

$$C = \frac{1}{m} \sum_{k=1}^{m} C_k. \tag{11}$$

Figure 5: Pareto-optimized solutions identified by various search
methods. (Axis label on each panel: memory (bits).)

Figure 6: Qualitative comparison of solutions found by partial
search and EA-based search.


The maximum cover rate is 1 and the minimum value is
0. The closer the cover rate of a solution set is to 1, the
better its coverage and the more even (more diverse) its
distribution. Because the Pareto-optimal front is unknown for
our targeted application, the minimum and maximum
values for each objective function were selected from the
solutions identified by all the search methods. We used
20 partitions per decade for MI calculation error (represented
using a logarithmic scale), 1 partition for every 50 LUTs
for the area requirement, and 1 partition for every 50 Kbits
of memory requirement. The cover rate for all the search
algorithms described earlier was calculated using the method
outlined above and the results are illustrated in Figure 8. The
EA-based search offers a better cover rate, which translates
to better range and diversity of solutions when compared
with either partial or random searches. In summary, our
EA-based search outperforms the random search and is capable
of offering more diverse and superior quality solutions when
compared with the partial search, using only 10% of the
execution time.

Figure 7: Comparison of search methods using the ratio of nondominated individuals (RNIs).

Figure 8: Comparison of search methods using cover rate.
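
The cover-rate calculation can be sketched in Python as shown below. Equal-width partitions are assumed, with logarithmic objectives handled by passing log10 values; the sample objective values and function names are made up for illustration.

```python
import math


def cover_rate(values, lo, hi, num_partitions):
    """Fraction of equal-width partitions of [lo, hi] holding at least one value."""
    width = (hi - lo) / num_partitions
    covered = set()
    for v in values:
        # The upper bound itself is assigned to the last partition.
        index = min(int((v - lo) / width), num_partitions - 1)
        covered.add(index)
    return len(covered) / num_partitions


def net_cover_rate(per_objective_rates):
    """Average the per-objective cover rates over all objectives."""
    return sum(per_objective_rates) / len(per_objective_rates)


# Area objective: 1 partition per 50 LUTs over a made-up range of [100, 500].
area_rate = cover_rate([100, 120, 260, 490], lo=100, hi=500, num_partitions=8)

# MI error on a log scale: 20 partitions per decade over two decades.
log_errors = [math.log10(e) for e in (1e-5, 3e-5, 1e-3)]
error_rate = cover_rate(log_errors, lo=-5.0, hi=-3.0, num_partitions=40)

net_rate = net_cover_rate([area_rate, error_rate])
```

With these sample values, three of the eight area partitions and three of the forty error partitions are covered, so the per-objective rates are 0.375 and 0.075 and the net cover rate is their average.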

4.2.  Accuracy  of  Image  Registration 

An important performance measure for any image registration
algorithm, especially in the context of medical imaging,
is its accuracy. We did not choose registration accuracy as
an objective function because of its dependence on data
(image pairs), the degree of misalignment between images,
and the behavior of the optimization algorithm that is used
for image registration. These factors, along with its execution
time, in our experience, may render registration accuracy
unsuitable as an objective function, especially if there is
nonmonotonic behavior with respect to the wordlength of
design variables. Another important aspect is that the desired
accuracy of registration depends on the application in which
image registration is employed. For example, during an
image-guided medical procedure, high registration accuracy
might be desired, whereas in a simple visualization task,
slightly inaccurate image registration may be tolerated.
Furthermore, in a multiresolution image registration approach,
a slightly inaccurate (but hardware resource-efficient) design
configuration can be employed at the initial levels and a more
accurate (but perhaps requiring more hardware resources)
design configuration can be used at later levels. Thus, image
registration accuracy is a constraint from an application
perspective and, as such, is not used to guide the exploration
of the design space. Instead, we used error in the MI
calculation, which is relatively less application- and
data-dependent, as an objective function.

Once the Pareto-optimized tradeoffs between MI calculation
error and hardware resources are obtained through
the presented approach, a system designer could evaluate
the performance of these Pareto-optimized design configurations
in the context of a specific target application. This
can be done by using a set of sample image pairs acquired
for that target application. To demonstrate the feasibility of
this approach, we selected CT-CT registration as an example
application. We randomly selected five clinical image pairs
for this analysis and registered them using the design
configuration corresponding to each Pareto-optimized solution.
These image pairs had dimensions of $256 \times 256 \times$
212-335 voxels and a resolution of 1.4-1.7 mm $\times$
1.4-1.7 mm $\times$ 1.5 mm. This image registration was performed
using the aforementioned bit-true simulator. The result
of registration was then compared with that obtained
using the double-precision software implementation. Registration
accuracy was calculated by comparing deformations at
the vertices of a cuboid (with size equal to half the image
dimensions) located at the center of the image. The results
of this analysis, which establish the relationship between MI
calculation error and the registration error specific to this
application of CT-CT registration, are reported in Figure 9.

Figure 9: Relationship between MI calculation error and image
registration error for an example application of CT-CT registration.
(Axis label: MI calculation error. Legend: registration error for
Pareto-optimal solutions; polynomial regression, $R^2 = 0.93$.)


It  must  be  noted  that  each  point  in  this  plot  represents  a  valid 
design  configuration.  As  expected,  there  is  a  good  correlation 
between  the  MI  calculation  error  and  the  accuracy  of  image 
registration.  This  demonstrates  that  optimized  tradeoff 
curves  between  MI  calculation  error  and  hardware  cost,  as 
identified  by  our  reported  analysis,  can  be  used  to  represent 
the  relationships  between  registration  accuracy  and  hardware 
cost  with  high  fidelity.  These  relationships  can  then  be  used 
to identify a design configuration in order to achieve desired
registration accuracy for this example application of CT-CT
registration. Similar relationships specific to a target
application (e.g., PET-CT registration) can be generated
using the same approach.
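
The registration-accuracy measure used in this section, which compares the vertices of a central cuboid under two transformations, can be sketched in Python as follows. The sketch assumes rigid transformations, coordinates in voxel units, and the mean vertex distance as the summary statistic; these choices and the function names are illustrative assumptions.

```python
import math
from itertools import product


def cuboid_vertices(dims):
    """Eight vertices of a centered cuboid whose sides are half the image dimensions."""
    return [tuple(d / 2 + s * d / 4 for d, s in zip(dims, signs))
            for signs in product((-1, 1), repeat=3)]


def apply_rigid(point, rotation, translation):
    """Apply a 3x3 rotation (nested lists, row-major) followed by a translation."""
    return tuple(sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
                 for r in range(3))


def registration_error(dims, reference, candidate):
    """Mean distance between vertex positions under two (rotation, translation) pairs."""
    distances = [math.dist(apply_rigid(v, *reference), apply_rigid(v, *candidate))
                 for v in cuboid_vertices(dims)]
    return sum(distances) / len(distances)


IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
reference = (IDENTITY, (1.0, 2.0, 3.0))   # e.g. double-precision result
candidate = (IDENTITY, (1.0, 2.0, 3.5))   # e.g. limited-precision result
error = registration_error((256, 256, 212), reference, candidate)
```

In this example the two transformations differ only by a 0.5-voxel translation along one axis, so every vertex moves by the same amount and the reported error is 0.5 voxels. A rotational discrepancy would instead produce different displacements at different vertices.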

4.3.  Postsynthesis  Validation 

We performed further validation of the presented
multiobjective optimization strategy through physical design
synthesis. We identified three solutions from the
Pareto-optimized set obtained using the EA-based search and
synthesized the aforementioned architecture with configurations
corresponding to these solutions. These solutions were
identified with no specific clinical application in mind, but
such that the tradeoff between various objective functions
(MI calculation error, area, and memory) can be readily
appreciated. Figure 9 reports the registration accuracy
(calculated using the bit-true emulator that we developed) for
all the Pareto-optimized design configurations. The system
designer will have access to all the Pareto-optimized design
configurations along with their expected MI calculation
error and hardware resource requirements, and, as such, can
select a design configuration to meet the requirements of a
given application.

These  three  configurations,  which  offer  gradual  tradeoff 
between  hardware  resource  requirement  and  error  in  MI 
calculation, are listed in the first column of Table 4. The
wordlengths associated with each configuration correspond
to the FWLs of the design variables identified in Table 1.
The design was synthesized for these configurations and
the resulting realizations were implemented using an Altera
Stratix II EP2S180F1508C4 FPGA (Altera Corporation,
San Jose, Calif, USA) on a PCI prototyping board
(DN7000K10PCI) manufactured by the Dini Group (La
Jolla, Calif, USA). We then evaluated the performance of
the  synthesized  designs  and  compared  it  with  that  predicted 
by  the  objective  function  models.  The  results  of  this  analysis 
are  summarized  in  Table  4  and  are  described  below. 

The  error  in  MI  calculation  was  computed  by  comparing 
the  MI  value  reported  by  the  limited-precision  FPGA 
implementation  against  that  calculated  by  a  double-precision 
software  implementation.  The  MI  calculation  error  was 
averaged  for  three  distinct  image  pairs  and  for  50  randomly 
generated  image  transformations  for  each  pair.  These  image 
pairs  and  the  associated  transformations  were  identical  to 
those  employed  in  the  objective  function  calculation.  In 
this  case,  the  average  MI  calculation  error  obtained  by  all 
the  design  configurations  was  identical  to  that  predicted 
by  the  objective  function  model.  This  is  expected  because 
of  the  bit-true  nature  of  the  simulator  used  to  predict  the 
MI  calculation  error.  We  repeated  this  calculation  with  a 
different  set  of  three  image  pairs  and  50  randomly  generated 
new  transformations  associated  with  each  image  pair.  The 
MI  calculation  error  corresponding  to  this  setup  is  reported 
in  the  second  column  of  Table  4.  The  small  difference  when 
compared  with  the  error  predicted  by  the  models  is  explained 
by  the  different  sets  of  images  and  transformations  used. 
The  area  and  memory  requirements  corresponding  to  each 
configuration  after  synthesis  are  reported  in  columns  three 
and  four  of  Table  4,  respectively.  For  comparison,  we  have 
also included the values predicted by the corresponding
objective function models in parentheses. It must be noted
that for all three configurations, the relative ordering based
on Pareto-dominance relationships with respect to each
objective  function  is  identical  for  both  postsynthesis  and 
model-predicted  values. 
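
The ordering claim above can be checked directly against the numbers in Table 4. The following Python sketch is only an illustration written for this report, not part of the authors' toolchain; the helper names (dominates, ordering, same_ordering, mutually_nondominated) are hypothetical. It assumes that all three objectives are minimized and uses the standard definition of Pareto dominance.

```python
# Table 4 values per design configuration:
# (MI calculation error, area, memory in Mbits). All objectives are minimized.
POSTSYNTHESIS = {
    (5, 6, 4, 9): (2.4e-3, 6527, 2.23),
    (8, 9, 7, 12): (5.3e-4, 7612, 2.45),
    (9, 12, 10, 17): (7.7e-5, 10356, 2.81),
}
PREDICTED = {
    (5, 6, 4, 9): (2.1e-3, 5899, 2.23),
    (8, 9, 7, 12): (5.2e-4, 6754, 2.45),
    (9, 12, 10, 17): (7.8e-5, 8073, 2.81),
}


def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def ordering(table, objective):
    """Configurations sorted by increasing value of one objective."""
    return sorted(table, key=lambda config: table[config][objective])


def same_ordering(first, second, n_objectives=3):
    """True if both tables rank the configurations identically for every objective."""
    return all(ordering(first, i) == ordering(second, i) for i in range(n_objectives))


def mutually_nondominated(table):
    """True if no configuration in the table dominates another."""
    values = list(table.values())
    return not any(dominates(a, b) for a in values for b in values)


if __name__ == "__main__":
    print("same ordering:", same_ordering(POSTSYNTHESIS, PREDICTED))
    print("postsynthesis set nondominated:", mutually_nondominated(POSTSYNTHESIS))
```

Both checks pass for the Table 4 values: the models and the synthesized designs rank the three configurations identically, and each configuration remains a genuine tradeoff rather than being dominated by another.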

We  also  evaluated  the  accuracy  of  image  registration 
performed  using  the  implementation  corresponding  to  each 
design  configuration.  For  this  analysis,  we  considered  the 
same five CT image pairs described above. As reported
earlier, these image pairs had dimensions of 256 × 256 ×
212–335 voxels and a resolution of 1.4–1.7 mm × 1.4–1.7 mm
× 1.5 mm. The image registration results for one
of those image pairs are illustrated in Figure 10. The results
of registration for the remaining image pairs were
qualitatively similar. The registration error was calculated
by comparing the obtained registration results with those
obtained using the double-precision software implementation.
The  mean  and  standard  deviations  of  the  registration  error 
corresponding  to  each  configuration  are  reported  in  Table  4. 
Good  correlation  is  seen  between  the  MI  calculation  error 
and  the  registration  error,  reinforcing  the  results  presented 
in  Section  4.2. 
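
As a rough numerical illustration of this correlation, the sketch below computes the Pearson coefficient between the postsynthesis MI calculation errors and the mean registration errors listed in Table 4. It is not the polynomial regression of Figure 9, which was fitted over all Pareto-optimized configurations; with only three points, the result merely indicates the direction and strength of the trend. The function name pearson is a hypothetical helper.

```python
import math

# Postsynthesis values from Table 4, one entry per synthesized configuration.
MI_ERROR = [2.4e-3, 5.3e-4, 7.7e-5]
REGISTRATION_ERROR_MM = [3.82, 1.57, 0.45]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


if __name__ == "__main__":
    r = pearson(MI_ERROR, REGISTRATION_ERROR_MM)
    print(f"Pearson r over the three Table 4 configurations: {r:.3f}")
```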

The  performance  of  the  resultant  design  configuration 
in  terms  of  its  raw  clock  rate  is  an  important  measure  of 
the quality of a design. This clock rate directly affects the


Figure  10:  Results  of  image  registration  performed  using  the  high-speed,  reconfigurable  implementation:  (a)  and  (b)  two  distinct  poses; 
(c)  fusion  of  (a)  and  (b)  using  a  checkerboard  pattern.  The  misalignment  between  images  is  evident  at  the  edges  of  the  squares  within 
the  checkerboard  pattern;  (d)-(f)  fusion  images  after  registration  using  the  identified  design  configurations.  These  configurations  offer 
progressively  reduced  image  registration  error  (3.82  mm,  1.57  mm,  and  0.45  mm,  resp.)  and  result  in  correspondingly  improved  image 
alignment.  The  arrows  indicate  representative  regions  with  misalignment  that  are  better  aligned  after  registration. 


Table  4:  Validation  of  the  objective  function  models  using  postsynthesis  results.  The  wordlengths  in  a  design  configuration  correspond  to 
the  FWLs  of  the  design  variables  identified  earlier  (see  Table  1). 

Objective functions (MI calculation error, area, memory) are given as postsynthesis value (predicted value).

Design          MI calculation             Area             Memory        Registration error     Design speed
configuration   error                      (no. of …)       (Mbits)       (mean ± standard       (f_max, MHz)
                                                                          deviation, mm)

{5, 6, 4, 9}    2.4 × 10^-3 (2.1 × 10^-3)  6527 (5899)      2.23 (2.23)   3.82 ± 1.24            211
{8, 9, 7, 12}   5.3 × 10^-4 (5.2 × 10^-4)  7612 (6754)      2.45 (2.45)   1.57 ± 0.69            197
{9, 12, 10, 17} 7.7 × 10^-5 (7.8 × 10^-5)  10356 (8073)     2.81 (2.81)   0.45 ± 0.16            184

maximum voxel throughput that can be achieved by the
design and, consequently, has an impact on the execution
speed of image registration. The speed of a design configuration
depends on, among other factors, the wordlengths
of the design parameters. For example, performing arithmetic
and memory operations using parameters with wider
wordlengths may incur additional latency. As a result, design
configurations employing design parameters with wider
wordlengths may be slightly slower, although more accurate,
than design configurations with shorter wordlengths. To
provide some insight into this phenomenon, we recorded
the maximum clock rate achieved by each of the design
configurations we identified for synthesis. This represents the
maximum postsynthesis frequency at which the design can
operate and is reported in the last column of Table 4. These
results indicate that the Pareto-optimized designs are not
unreasonably slow and that their performance is comparable
to that achieved (200 MHz) for a user-optimized design
reported earlier [6].
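
To give a feel for how clock rate translates into voxel throughput, the sketch below converts the f_max values of Table 4 into the time needed to stream one image volume through the design. It rests on an assumption that is not stated in this section: that the pipeline consumes one voxel per clock cycle (the voxels_per_cycle parameter, a hypothetical name). The actual architecture may process more or fewer voxels per cycle, so the figures are indicative only.

```python
def voxel_throughput(fmax_mhz, voxels_per_cycle=1.0):
    """Voxels processed per second at the given clock rate.

    voxels_per_cycle = 1.0 is an assumption made for illustration only.
    """
    return fmax_mhz * 1e6 * voxels_per_cycle


def volume_pass_time_ms(dims, fmax_mhz, voxels_per_cycle=1.0):
    """Time in milliseconds to stream every voxel of one volume through the design."""
    nx, ny, nz = dims
    return 1e3 * (nx * ny * nz) / voxel_throughput(fmax_mhz, voxels_per_cycle)


if __name__ == "__main__":
    smallest_volume = (256, 256, 212)  # lower end of the image sizes reported above
    for fmax in (211, 197, 184):       # design speeds from Table 4
        t = volume_pass_time_ms(smallest_volume, fmax)
        print(f"{fmax} MHz: {t:.2f} ms per pass over the volume")
```

Under this assumption, the difference between the fastest and slowest configurations amounts to a few milliseconds per pass over a volume, which is consistent with the observation that the wider-wordlength designs are only slightly slower.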

This postsynthesis validation further demonstrates the
efficacy of the presented optimization approach for reconfigurable
implementation of image registration. It also further
demonstrates how the approach enables a designer to systematically
choose an efficient system configuration to meet
the registration accuracy requirements for a reconfigurable
implementation.

5. Discussion

With the need for real-time performance in signal processing
applications, an increasing trend is to accelerate computationally
intensive algorithms using custom hardware implementation.
A critical step in going to a custom hardware
implementation is converting floating-point implementations
to fixed-point realizations for performance reasons.
This conversion process is an inherently multidimensional
problem because several conflicting objectives, such as area
and error, must be simultaneously minimized. By systematically
deriving efficient tradeoff configurations, one can
not only reduce the design time [10] but can also enable
automated design synthesis [41, 42]. Moreover, these tradeoff
configurations allow designers to identify optimized, high-quality
designs for reconfigurable computing applications.
Our work presented in this paper develops a framework
for optimizing tradeoff relations between hardware cost and
image processing accuracy in the context of FPGA-based
medical image registration.
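
The effect of the fractional wordlength (FWL) on accuracy can be illustrated with a small sketch. The code below is not the bit-true emulator described in this paper; it is a simplified stand-in that rounds each probability of a made-up distribution to a given number of fractional bits and measures the resulting error in Shannon entropy, one of the building blocks of MI. The function names and the example distribution are hypothetical.

```python
import math


def to_fixed(value, fwl):
    """Round value to the nearest multiple of 2**-fwl (fwl fractional bits)."""
    scale = 1 << fwl
    return round(value * scale) / scale


def entropy_bits(probabilities):
    """Shannon entropy in bits, skipping zero-probability entries."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def entropy_error(probabilities, fwl):
    """Absolute entropy error caused by quantizing probabilities to fwl bits."""
    quantized = [to_fixed(p, fwl) for p in probabilities]
    return abs(entropy_bits(quantized) - entropy_bits(probabilities))


# Illustrative distribution only; it is not taken from the image data in this paper.
EXAMPLE_DISTRIBUTION = [0.1, 0.2, 0.3, 0.4]

if __name__ == "__main__":
    for fwl in (4, 10, 17):
        error = entropy_error(EXAMPLE_DISTRIBUTION, fwl)
        print(f"FWL = {fwl:2d} bits: entropy error = {error:.2e}")
```

Each additional fractional bit halves the worst-case rounding error of a single value, but it also widens every datapath and memory that carries the value, which is the area-versus-error conflict that the optimization framework explores.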

Earlier approaches to optimizing wordlengths used analytical
methods for range and error estimation [11-15].
Some of these have used the error propagation method
(e.g., see [14]), whereas others have employed models of
worst-case error [12, 15]. Although these approaches are
faster and do not require simulation, formulating analytical
models for complex objective functions, such as MI, is
difficult. Statistical approaches have also been employed for
optimizing wordlengths [43, 44]. These methods employ
range and error monitoring for identifying appropriate
wordlengths. These techniques do not require range or error
models. However, they often need long execution times and
are less accurate in determining effective wordlengths.

Some  published  methods  search  for  optimum 
wordlengths  using  error  or  cost  sensitivity  information. 
These  approaches  are  based  on  search  algorithms  such 
as  “Local,”  “Preplanned,”  and  “Max-1”  search  [16,  45]. 
However,  for  a  given  design  scenario,  these  methods  are 
limited to finding a single feasible solution, as opposed to
a  multiobjective  tradeoff  curve.  In  contrast,  the  techniques 
that  we  have  presented  in  this  paper  are  capable  of  deriving 
efficient  tradeoff  curves  across  multiple  objective  functions. 

Other heuristic techniques that take into account tradeoffs
between hardware cost and implementation error and
enable automatic conversion from floating-point to fixed-point
representations are limited to software implementations
only [42]. Also, some of the methods based on
heuristics do not support different degrees of fractional
precision for different internal variables [12]. In contrast, our
framework allows multiple fractional precisions, supports a
variety of search methods, and thereby captures more
comprehensively the complexity of the underlying multiobjective
optimization problem.

Other approaches to solve this multiobjective problem
have employed weighted combinations of multiple objectives
and have reduced the problem to mono-objective optimization
[38]. This approach, however, is prone to finding
suboptimal solutions when the search space is nonconvex [17].
Some methods have also attempted to model this problem as
a sequence of multiple mono-objective optimizations [46].
The underlying assumption in this approximation, however,
is that the design parameters are completely independent,
which is rarely the case in complex systems. Modeling this
problem as an integer linear programming formulation has
also been shown to be effective [11]. But this approach is
limited to cases in which the objective functions can be
represented or approximated as linear functions of design
variables.
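
The limitation of weighted-sum scalarization on nonconvex tradeoff curves can be seen in a toy example. The three design points below are invented for illustration and do not correspond to any configuration in this paper; point B is Pareto-optimal but lies in a nonconvex region of the front. The sketch sweeps the weight over its full range and records which point minimizes the weighted sum at each setting.

```python
# Hypothetical (error, area) pairs, both minimized. B is a valid tradeoff
# that lies above the straight line joining A and C (a nonconvex front).
FRONT = {"A": (1.0, 10.0), "B": (6.0, 6.0), "C": (10.0, 1.0)}


def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_optimal(points):
    """Names of the points that are not dominated by any other point."""
    return {
        name
        for name, value in points.items()
        if not any(dominates(other, value) for other in points.values())
    }


def weighted_sum_winners(points, steps=100):
    """Names of the points that minimize w*f1 + (1 - w)*f2 for some w in [0, 1]."""
    winners = set()
    for i in range(steps + 1):
        w = i / steps
        best = min(points, key=lambda n: w * points[n][0] + (1 - w) * points[n][1])
        winners.add(best)
    return winners


if __name__ == "__main__":
    print("Pareto-optimal points:", sorted(pareto_optimal(FRONT)))
    print("Found by weighted sum:", sorted(weighted_sum_winners(FRONT)))
```

All three points are Pareto-optimal, yet no choice of weight ever selects B, because its weighted cost is always higher than that of A or C. A search that maintains a set of nondominated solutions, as the EA-based approach does, is not subject to this restriction.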

EAs have been shown to be effective in solving various
kinds of multiobjective optimization problems [18, 19]
but have not been extensively applied to finding optimal
wordlength configurations. One of the earlier attempts at
using multiobjective EA formulation for wordlength optimization
was reported by Istepanian and Whidborne [47].
This approach employed a simplistic model for hardware
complexity and was limited to linear systems only. Leban and
Tasic [48] also reported EA-based wordlength optimization
of adaptive filters. However, this work was limited to
mono-objective optimization only. More recently, Han et al. [49]
reported EA-based multiobjective wordlength optimization
for a filtering application. This work, however, considered
only linear objective functions and lacked postsynthesis
validation. In contrast, our work demonstrates the applicability
of EA-based search for finding superior Pareto-optimized
solutions in an efficient manner, even in the presence of
a nonlinear objective function. Moreover, our optimization
framework supports multiple search algorithms and objective
function models and may be extended to a wide range of
other signal processing applications. A preliminary version
of the work presented in this article was published in [50]. This
paper represents an enhanced and more thorough version
of that work. New developments that we have incorporated
into this paper include elaborating on the parameterized
architectural design, evaluating the fidelity of the objective
function models, and verifying the applicability of the
proposed methodology through postsynthesis validation. In
summary, this work has presented a framework that is
capable of performing multiobjective wordlength optimization
and identifying Pareto-optimized design configurations
even in the context of nonlinear and complex objective
functions. Through postsynthesis validation, this work has
also demonstrated the feasibility of such a multiobjective
optimization framework in the context of a representative
image processing application, medical image registration.

6.  Conclusion 

One of the main strengths of reconfigurable computing over
general-purpose processor-based implementations is its ability
to utilize more streamlined representations for internal
variables. This ability can often lead to superior performance
and optimized fabric utilization in reconfigurable computing
applications. Given this advantage, it is highly desirable to
automate the derivation of optimized design configurations
that can be switched among at run time. Toward that end,
this paper has presented a framework for multiobjective
wordlength optimization of finite-precision, reconfigurable
implementations. This framework considers multiple conflicting
objectives, such as hardware resource consumption
and implementation accuracy, and systematically explores
tradeoff relationships among the targeted objectives. Our
work has also further demonstrated the applicability of EA-based
techniques for efficiently identifying Pareto-optimized
tradeoff relations in the presence of complex and nonlinear
objective functions. The evaluation that we have
performed in the context of FPGA-based medical image
registration demonstrates that such an analysis can be used to
enhance automated hardware design processes and to efficiently
identify a system configuration that meets given design
constraints. Furthermore, the multiobjective optimization
approach that we have presented is not application-specific
and, with additional work, may be extended to a multitude
of other signal processing applications.

Augmented Reality for Laparoscopic Surgery Using a Novel Imaging Method - Initial
Results from a Porcine Animal Model

R Shekhar PhD, C Godinez MD, S Kavic MD, E Sutton MD, V Bhat, O Dandekar, I George, A
Park MD

Department  of  Surgery,  University  of  Maryland  School  of  Medicine 

Background:  Intraoperative  appreciation  of  visible  anatomy  along  with  awareness  of  underlying 
structures  and  vasculature  is  invaluable  to  the  operating  surgeon.  The  advent  of  minimally 
invasive  techniques,  with  reduced  tactile  feedback  and  limited  visual  displays  has  only 
heightened  the  need  for  improved  visualization  of  target  anatomy  and  adjacent  but  visually 
imperceptible  structures.  Current  laparoscopic  images  are  rich  in  surface  detail  but  provide  no 
information  on  deeper  features.  We  are  developing  a  novel  method  of  performing  laparoscopic 
surgery  using  a  64-slice  computed  tomography  (CT)  scanner  with  continuous  scanning  capability. 
This  study  describes  our  work  to  date  to  produce  an  augmented  reality  (AR)  image  that 
instantaneously  renders  intraoperative  CT  images  with  the  live  images  from  the  laparoscope. 

Methods: Under an Institutional Animal Care and Use Committee (IACUC)-approved protocol,
we conducted a series of CT-guided laparoscopic operations using a non-survival porcine model.
A fully equipped laparoscopic surgical suite was assembled within the CT scan room. A
multidisciplinary research team composed of minimally invasive surgeons, radiologists, and
biomedical engineers contributed to study design and conducted the experiments. We employ a
64-slice CT scanner with continuous scanning capability to image the surgical field approximately
once per second. An infrared detection system tracked the position of a specially equipped
laparoscope in order to reconcile the laparoscopic view with the corresponding 3-D CT image.
Laparoscopic operations performed included peritoneoscopy, cholecystectomy, hepatic wedge
resection, and gastrorrhaphy, with intraoperative CT scanning. Deformable image registration
(alignment) techniques and low-dose reconstruction methods allow intraoperative CT scanning at
25 mAs, roughly 10 times lower than the standard diagnostic dose. Using commercially available
software, we generate an AR image that merges reconstructed intraoperative CT with images from
the laparoscope.

Results and Conclusions: Through a series of six operative experiments, we have amassed a
data set that includes rendered video and laparoscopic images, demonstrating the feasibility of
merging optical surface information with radiographically imaged deep anatomic features (Fig 2).
Our method represents an accurate, instantaneous, high-refresh-rate approach to AR, which we
have termed “live AR.” These initial experiments represent the first use of a new surgical
visualization capability, with potential to significantly enhance operative performance.

Routine clinical information systems now have the
ability  to  gather  large  amounts  of  data  that  surgical 
managers  can  access  to  create  a  seamless  and  proactive 
approach  to  streamlining  operations  and  minimizing 
delays.  The  challenge  lies  in  aggregating  and  displaying 
these  data  in  an  easily  accessible  format  that  provides 
useful,  timely  information  on  current  operations.  A 
Web-based, graphical dashboard is described in this
study, which can be used to interpret clinical operational
data, allow managers to see trends in data, and
help identify inefficiencies that were not apparent with
more traditional, paper-based approaches. The dashboard
provides a visual decision support tool that
assists  managers  in  pinpointing  areas  for  continuous 
quality  improvement.  The  limitations  of  paper-based 
techniques,  the  development  of  the  automated  display 
system,  and  key  performance  indicators  in  analyzing 
aggregate  delays,  time,  specialties,  and  teamwork  are 
reviewed.  Strengths,  weaknesses,  opportunities,  and 
threats  associated  with  implementing  such  a  program 
in  the  perioperative  environment  are  summarized. 

Keywords: graphical dashboarding; business intelligence;
scorecarding; information visualization; quality


Management of the modern perioperative
environment is a challenging act of balance
and orchestration that often tilts perilously
close to chaos. Many unforeseen delays
(including, but by no means limited to, patient
transport; case cart preparation; consent forms; and
slow turnover) can trigger a cascade of events that
escalate throughout the day, resulting in frustration
for physicians, staff, and patients. The cumulative
effects of many small and interacting delays keep
the operating room (OR) from running at peak efficiency
and can, in some cases, contribute to more
serious errors in management and care.

Managers trying to address these delays in an ad
hoc fashion find themselves playing “whack-a-mole”
with serial problems; as soon as one glitch is
resolved, another rears its head. The result is a reactive
approach that focuses on immediate problems at
the expense of the time and effort needed to identify
root causes and long-term solutions that will prevent
recurrences. The good news is that various routine
clinical information systems now gather enormous
volumes of data that surgical managers can access to
create a seamless and proactive approach to streamlining
operations and minimizing delays.

However, the process of leveraging these data in
support of routine improvements presents its own
challenges, particularly in rethinking traditional
reporting and analysis techniques. In the past, paper
spreadsheet-based reporting methodologies have
been used in management meetings, an approach
that is insufficient to handle or analyze even the
broadest trends in the increasingly large volumes of
useful data collected. This traditional, tactical
approach most often focuses on only the short-term
history of operations and fails to identify the small,
recurrent delays that may occur across services.

Transparency and a broad scope of accountability
are widely recognized as hallmarks of high reliability
and dedication to quality in health care
organizations.1 These values, along with an emphasis
on verifiable metrics and automated means of collection
and assessment, have figured in significant
advances in operations research in the management
of the surgical environment in the past 5 years.2-6
These advances contribute to the fulfillment of
important goals of surgical units, including patient
safety, access to ORs, economic efficiency, waiting
time, and staff satisfaction.6 Moreover, they have
provided novel information about what factors contribute
to which specific quality goals. Simulation
studies using operations research, for example, have
indicated that although immediate quality improvements
in patient safety, waiting time, and satisfaction
on the day of surgery should be a primary focus, only
longer term decisions on staffing will provide economic
efficiencies.6,7 Thus, in general, reduction in
turnover time will not result in increased volume,8
but access to historical data and application of operations
research methods can point to staffing solutions
that will optimize economic efficiency.9

Our goal, as part of a grant on the OR of the
Future from the Telemedicine and Advanced
Technology Research Center, was to accelerate the
adoption of these advances by providing an automated,
holistic view of operations that would enable
the managers to discover patterns and causes of
delays. We created a Web-based, graphical dashboard
that could be used to interpret clinical operational
data, could allow managers to see trends in
data, and could help identify inefficiencies that were
not apparent with more traditional approaches. This
dashboard was designed to provide a visual decision
support tool that would also assist managers in pinpointing
problem areas in which the greatest benefits
could be achieved by applying time and energy
toward continuous quality improvement.

What  is  Business  Intelligence? 

The field of business intelligence, sometimes referred
to as business analytics, is the utilization of data warehousing,
data mining, modeling, and forecasting to
aid in managerial decision support systems.10,11
Business intelligence is defined as extracting useful
information from the data generated by operational
systems of an enterprise.12 Many top corporate executives
use business intelligence-generated electronic
scorecarding and dashboarding methodologies to
manage their operations with real-time decision-making
support. A 2007 Gartner Inc worldwide
survey of 1400 chief information officers ranked
business intelligence as the number 1 technology
priority for remaining strategically competitive.13
Business intelligence methodologies extend directly
to consumers in certain markets. Financial Web sites
provide individual investors with extensive research
and graphical performance analysis of publicly
traded companies.

Large  academic  medical  centers,  which  often 
generate  revenues  in  excess  of  US$200  million,  are 
in  the  same  financial  league  as  the  medium-to-large 
businesses  in  which  dashboards  are  commonplace. 
Yet few medical centers have invested in the development
and routine implementation of tools to analyze
and improve the efficiency and effectiveness of
perioperative  management  and  operations.  Many 
ORs  continue  to  “fly  blind”  regarding  concepts  such 
as  indexing  and  performance  measurement. 

Internal graphical dashboards have been proven
in other environments to provide useful and productive
platforms for continuous quality improvement.
The Six Sigma quality methodology, for example, frequently
uses dashboards for process management.14
A dashboard can provide a consistent framework of
defined metrics, known as key performance indicators,
that aid in defining and redefining quality and
goals, as well as offering quantifiable data on
achievements.15 Evidence of consistent improvements
through a public dashboard is then used to
help align the various parts of an organization to target
enhanced performance.

The  Potential  of  Information 
Visualization 

Our project was designed to provide a visual knowledge
exploration system to assist managers and senior
leadership in understanding trends and patterns.
The tools used in this system provide interactive
views of data at various granularities and in a series
of graphical or tabular formats. These tools can
quickly and with minimal user effort impose various
types of analyses on the full data set or on interactively
selected subsets of data.

The goal of a visual knowledge exploration system
is to provide tools that facilitate interaction with
information in an easy, transparent, and meaningful
manner. A well-designed graph can tap into the
pattern-recognition capabilities of the human visual
system. In certain types of patterns, human vision
can identify a unique (outlier) value within 200 milliseconds,
regardless of whether few or many data
points are present.16,17 However, this ability is
entirely dependent on the manner in which the pattern
is displayed. Proper visual display is crucial to
the use of large data sets for complex decision-making
support. The optimal type of data display has been
the focus of a substantial body of literature and
reporting, and the definitive answer changes as rapidly
as new technologies enter the information
arena.18-22 Some studies suggest the superiority of
graphical formats (bar charts, pie charts, etc) over
tabular presentation (data tables) for certain tasks,
whereas the reverse is true for other tasks. More
recent work indicates that a constellation of factors
must be considered in determining the most advantageous
data set display formats, including type of
task, underlying structure of the data, and the
knowledge level of the users.23

What  Is  the  Problem  with 
Paper-Based  Reporting? 

The  benefits  of  graphical  dashboarding  can  be 
appreciated  more  fully  by  looking  at  the  limitations 
of  traditional  paper-based  reporting  in  identifying 
and  managing  ongoing  operations  challenges. 
Understanding  these  limitations  is  important  because 
in  many  institutions  paper-based  reporting  is  so 
engrained  into  routine  practice  that  clinicians  and 
perioperative  managers  may  find  it  difficult  to  take 
the  steps  needed  to  adapt  to  other  methodologies. 
Paper-based reporting management systems are limited
in the following areas:

Time 

Significant  time  is  required  to  gather  information 
from  various  sources  and  compile  reports  by  hand. 
Decision  making  is  a  time-sensitive  activity  that 
requires actionable information. Decision making, a
process that should be based on fresh data, is
adversely affected when time simply does not permit
the preparation of all possible permutations of
analyses that might be informative and useful.

Effort 

The  inherent  limitations  of  paper  restrict  the  number 
of questions that can be asked and tend to generalize
rather  than  to  drill  down  in  areas  of  analyses. 
Expanding  the  scope  of  a  paper  report  requires  extra 
labor.  Most  often,  the  result  is  a  trade-off  between  the 
time  required  to  generate  the  report  and  the  quality  of 
effort  required  in  preparing  the  results  for  analysis. 

Hindsight 

One of the most frustrating characteristics of paper-based
reports is that they provide only answers to
questions that were identified before the management
meeting and discussion. New questions asked
during the meeting must be tabled until the next
meeting so that analysts can gather the new information
required. These tabled questions prevent
more purposeful discussions about the data and
leave managers with limited information to support
decision making in the short term.

Scope  of  Report 

The  amount  of  information  that  can  be  contained  in 
a  paper-based  report  is  limited,  as  is  the  amount  of 
information  that  can  be  reviewed  within  a  reasonable 
amount  of  time.  Selectivity  becomes  a  necessity;  yet, 
it  is  difficult  to  predict  which  questions  managers 
will  have  during  any  given  meeting.  Attempts  to 
broaden  the  scope  of  paper-based  reports  can  be 
both  time  consuming  and  problematic;  the  larger  the 
amount  of  data  in  a  paper  report,  the  more  difficult 
it  is  to  find  any  specific  piece  of  information. 

Granularity 

Aggregate  statistics  do  not  allow  the  user  to  drill 
down  to  understand  the  underlying  distributions  to 
evaluate  credibility.  Mean  statistics  offered  in  most 
paper-based  reports  are  unreliable  when  describing 
nonnormal distributions of data. A single chart or
table on paper can show only one view, and it is difficult
to present both overviews and detailed information
in a single presentation. Showing trends can
obscure source data, whereas showing only source
data can obscure trends. Including both or all in a
paper-based report can be labor- and time-intensive to
review and is impractical as a routine practice.
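
The point about mean statistics and nonnormal distributions can be illustrated with a short sketch. The delay values below are hypothetical and chosen only to show a skewed distribution; they are not taken from the study data.

```python
from statistics import mean, median

# Hypothetical start delays in minutes for one service (illustrative only):
# mostly short delays plus two long outliers, i.e. a right-skewed distribution.
delays = [5, 6, 7, 8, 8, 9, 10, 12, 95, 140]

def summarize(values):
    """Return the mean, the median, and how many values fall below the mean."""
    avg = mean(values)
    return {
        "mean": avg,
        "median": median(values),
        "below_mean": sum(1 for v in values if v < avg),
    }

summary = summarize(delays)
print(summary)
```

In this made-up sample, two outliers pull the mean well above the typical delay, so a report that shows only the mean would overstate what most cases experienced. Showing the median or the full distribution alongside the mean avoids that.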


Multiple  Versions  of  Truth 

Different groups generating separate reports or even
the same reports at different times can result in conflicting
operation directives that can add to confusion
in effective decision making. The human
intervention  inherent  in  paper-based  reporting  can 
also  introduce  bias  that  may  lead  to  data  analysis 
errors.  Moreover,  the  passage  of  time  between  the 
collection  or  analysis  of  data  and  final  reporting  may 
mean  that  no  direct  links  exist  between  current 
operational  data  and  the  paper-based  report. 

Dashboard  Design 

Organization  of  the  Web  Site 

The Web site was designed to allow analysis from
several perspectives. One of the principal mantras of
information visualization and data discovery, identified
by Shneiderman,24 is the ability to overview
first, zoom, filter, then details-on-demand. The ability
to view data from multiple perspectives assists
and increases confidence in decision making. The
user accesses each category via a navigation bar of
tabs at the top of the Web site. As the user navigates
through the system, a trail (called a breadcrumb;
Figure 1) is displayed to illustrate how the user navigated
to that point and to allow easy backtracking.

To create a concise visualization environment,
we created graphs that were clickable within a drilldown
interface to provide fast and intuitive zooming
and filtering of data. These graphs also provide
detailed information about data points when the
user hovers over a data marker with the mouse. One
of the core requirements for a management dashboard
is that it be Web based to allow secure access
to all authorized users from any location at any time.

Figure 1. A, Pie chart analysis of number of occurrences of
each delay type. B, Pie chart of the relative amount of delay
caused by each delay type.

Standard data warehousing techniques were
used: extraction, transformation, and loading. The
database was populated by parsing a text file in a
comma-separated format provided by the
Cerner SurgiNet Surgical Information System
(Kansas City, Missouri). The text file was converted
to ANSI Structured Query Language commands as
inserts using the MySQL database administration
utility phpMyAdmin. Data were anonymized for
patient information because the focus of the system
was operational efficiency, not identifying specific
patient-associated incidents. In all, 6 months of
operational data were uploaded into the system,
incorporating performance statistics on 7807 cases
in 8 MB of disk space on the server. These cases
incorporated all of the operating rooms with both
inpatient and outpatient admissions.
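
The load step described above can be sketched as follows. This is a minimal illustration, not the project's actual procedure: the column names and sample rows are hypothetical (the layout of the SurgiNet export is not given in the text), and an in-memory SQLite database stands in for the MySQL database that was populated through phpMyAdmin.

```python
import csv
import io
import sqlite3

# Hypothetical extract; column names and values are assumptions for illustration.
SAMPLE_CSV = """case_id,room,service,scheduled_start,patient_in_room
1001,OR-01,Orthopedics,2007-01-08 07:30,2007-01-08 07:52
1002,OR-02,Urology,2007-01-08 07:30,2007-01-08 07:31
"""

def load_cases(conn, csv_text):
    """Parse comma-separated text and load each row with a parameterized INSERT."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cases ("
        "case_id INTEGER PRIMARY KEY, room TEXT, service TEXT, "
        "scheduled_start TEXT, patient_in_room TEXT)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [
        (int(r["case_id"]), r["room"], r["service"],
         r["scheduled_start"], r["patient_in_room"])
        for r in reader
    ]
    conn.executemany("INSERT INTO cases VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL server
loaded = load_cases(conn, SAMPLE_CSV)
print(loaded, "cases loaded")
```

The extract carries no patient identifiers, in keeping with the anonymization described above. Parameterized inserts are used here instead of generated SQL text so that field values cannot alter the statement.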

Identifying  Key  Performance  Indicators 

The  American  Association  of  Clinical  Directors 
derived  a  common  glossary  of  the  exact  meaning  of 
times  used  for  scheduling  and  monitoring  surgical 
procedures.25  Time  stamps  were  extracted  from  the 
clinical  database  for  scheduled  start  time,  time  at 
which  the  patient  enters  the  operating  room,  the  time 
at  which  surgery  begins,  surgery  end  time,  time  at 
which  the  patient  leaves  the  room,  and  the  turnover 
time  of  the  room.  As  a  hospital  policy,  when  a  case 
begins 15 or more minutes later than scheduled, the
circulation  nurse  must  specify  a  reason  for  the  delay. 
These  performance  data  are  combined  with  data  from 
the  case  such  as  the  room,  surgeon,  anesthesiologist, 
case  number  of  the  day,  and  the  service. 
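
As a rough sketch of how the 15-minute policy could be applied to the extracted time stamps, the function below computes the start delay as in-room time minus scheduled start time and flags cases that would require a delay reason. The function names and the time stamp format are assumptions made for illustration.

```python
from datetime import datetime

DELAY_THRESHOLD_MIN = 15  # policy threshold stated in the text
TIME_FORMAT = "%Y-%m-%d %H:%M"  # assumed time stamp format

def start_delay_minutes(scheduled_start, patient_in_room):
    """Minutes between the scheduled start and the patient entering the room."""
    scheduled = datetime.strptime(scheduled_start, TIME_FORMAT)
    in_room = datetime.strptime(patient_in_room, TIME_FORMAT)
    return (in_room - scheduled).total_seconds() / 60

def needs_delay_reason(scheduled_start, patient_in_room):
    """True when the case starts 15 or more minutes later than scheduled."""
    return start_delay_minutes(scheduled_start, patient_in_room) >= DELAY_THRESHOLD_MIN

print(needs_delay_reason("2007-01-08 07:30", "2007-01-08 07:52"))
```

A negative result from `start_delay_minutes` simply means the patient entered the room before the scheduled start.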

A total of 43 delay types were identified as
reasons for delays. These were grouped into general
root causes of materials, patient, prerequisite task(s),
scheduling, staff, and transport. At the top of the
delay analysis page, as shown in Figure 1, pie charts
demonstrate the relative number of delays per root
cause and their cumulative impact in time. Although
some delays are not numerous, they might have a
large effect on the operations of an OR. When the user
clicks on the pie chart and selects a root cause, the system
presents an analysis page of all the underlying
delay causes for that root cause, broken down in the
same way by relative number and impact. When the user
selects a delay cause, the system shows a breakdown
by specialty, displaying the number of
incidents and their average delay times. Selecting a
service displays all the cases, and selecting a case
displays the details for that case. Within the
span of 4 clicks, a user can drill down from all the
cases in the database to the details of an individual
case. The delay analysis tool is useful in understanding
the cumulative cost of systemic delays and
which specialties are most affected by them.
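
The two pie charts correspond to two aggregations over the same delay records: a count per root cause and a sum of delay minutes per root cause. The sketch below shows one way to compute both. The delay type names, their mapping to root causes, and the minutes are hypothetical, since the full list of 43 delay types is not reproduced in the text.

```python
from collections import defaultdict

# Hypothetical mapping from delay type to root cause (illustrative only).
ROOT_CAUSE = {
    "Instrument not available": "materials",
    "Patient arrived late": "patient",
    "Surgeon late": "staff",
    "Anesthesia late": "staff",
}

# Hypothetical delay records: (delay type, delay in minutes).
records = [
    ("Surgeon late", 20),
    ("Anesthesia late", 15),
    ("Surgeon late", 25),
    ("Patient arrived late", 15),
    ("Patient arrived late", 20),
    ("Instrument not available", 90),
]

def summarize_by_root_cause(delay_records, mapping):
    """Return {root cause: {"count": n, "minutes": total}} for the given records."""
    summary = defaultdict(lambda: {"count": 0, "minutes": 0})
    for delay_type, minutes in delay_records:
        root = mapping[delay_type]
        summary[root]["count"] += 1
        summary[root]["minutes"] += minutes
    return dict(summary)

summary = summarize_by_root_cause(records, ROOT_CAUSE)
for root, stats in sorted(summary.items()):
    print(root, stats)
```

In this made-up sample the materials category has the fewest occurrences but the largest share of delay minutes, which is the situation the second pie chart is meant to expose.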


Temporal  Analysis 

The  temporal  perspective  provides  a  daily  tactical 
review  of  cases  to  determine  over  a  specified  period 
of  time  which  ones  were  delayed  and  why.  To  present 
the utilization levels of the ORs in a given day, we
used a polar chart showing cumulative room utilization
as a function of the hour of the day (Figure 2).
This is useful for looking at the relationship between
room utilization and staffing levels. The user can
drill  down  to  the  specifics  of  a  single  case  or  can 
choose  to  look  at  data  grouped  by  room  or  specialty. 
System  delay  types,  such  as  transport  issues,  can 
affect  multiple  rooms  and  specialties  across  suites  of 
ORs  over  different  periods  of  time. 
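
The values behind an hourly utilization chart can be derived from the patient-in-room and patient-out-of-room time stamps. The sketch below is one possible calculation under assumed inputs: it sums the occupied room-minutes that fall within each hour of the day. The sample cases are hypothetical, and the text does not specify how the dashboard itself computes the chart.

```python
from datetime import datetime, timedelta

def occupied_minutes_by_hour(cases):
    """Sum patient-in-room minutes falling in each hour of the day (index 0-23).

    `cases` is an iterable of (in_room, out_of_room) datetime pairs.
    """
    minutes = [0.0] * 24
    for in_room, out_of_room in cases:
        cursor = in_room
        while cursor < out_of_room:
            # End of the clock hour that contains `cursor`.
            next_hour = cursor.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
            segment_end = min(next_hour, out_of_room)
            minutes[cursor.hour] += (segment_end - cursor).total_seconds() / 60
            cursor = segment_end
    return minutes

# Hypothetical cases in two rooms on one day.
cases = [
    (datetime(2007, 1, 8, 7, 30), datetime(2007, 1, 8, 9, 15)),
    (datetime(2007, 1, 8, 8, 0), datetime(2007, 1, 8, 8, 45)),
]
hourly = occupied_minutes_by_hour(cases)
print({hour: m for hour, m in enumerate(hourly) if m})
```

Dividing each hourly total by 60 times the number of staffed rooms would give a utilization fraction that can be compared with staffing levels.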

Service analysis focuses on key performance
indicators within each specialty. Medical specialties
within the OR have widely differing dynamics for
case efficiency, utilization, turnover, and case length
based on a number of factors, including but not limited
to procedure complexity and patient acuity. For
some types of data analysis involving services or subspecialties,
bubble charts provided a useful way to
organize data (Figure 3). The bubble chart plots
each service by its average case length on the x axis
and average delay duration on the y axis. The size of
the bubble for each specialty is directly proportional
to the number of cases performed: the more cases a
service performs, the larger the diameter of
the bubble.

For each specialty analysis, the site provides
histograms for case length and delay duration.
Histographic analysis is useful in determining the
distribution type, the spread of the distribution, and
the existence of outliers that may distort statistical
analysis.

Figure 2. A polar chart of room occupation as a function of
the hour of the day. OR indicates operating room.

Figure 3. A bubble chart of the services within surgery plotted
along their average case length versus their average delay
duration. The size of each bubble is proportional to the number
of cases performed by the service. (Panel title: Service Delay
Analysis; x axis: Average Case Length (minutes).)

Another display generated was a scatter graph of
all cases plotted by their scheduled case lengths
compared with actual duration (Figure 4). Regression
analysis shows potential correlations along with
graphical bands illustrating confidence intervals for
the line and the points. This index of predictability
of scheduling is especially useful in identifying and
drilling down on the outlier cases to understand
their causes of variance. Each diamond represents
an individual case, and clicking on a diamond
displays the details of that case (Figure 5).

Figure 4. A scatter plot with confidence banding plotting
the scheduled in-room duration versus the actual patient time in
the room. (Panel title: Tot Patient In Room Analysis; legend: 95%
Line Confidence, 95% Point Confidence.)
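
The regression with line and point confidence bands described for Figure 4 can be sketched with ordinary least squares. This is an illustration under stated assumptions, not the dashboard's charting code: the durations are hypothetical, and a normal quantile is used in place of the Student t quantile, which is reasonable only for large samples such as the several thousand cases in the database. The tiny sample here is only to keep the example short.

```python
import math
from statistics import NormalDist, mean

def fit_line_with_bands(x, y, level=0.95):
    """Least-squares fit of y on x, plus a function giving band limits at x0."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(residual_ss / (n - 2))  # residual standard error
    z = NormalDist().inv_cdf(0.5 + level / 2)  # large-sample stand-in for t

    def bands(x0):
        """Return (fit, line interval, point interval) at x0."""
        fit = intercept + slope * x0
        leverage = 1 / n + (x0 - x_bar) ** 2 / sxx
        line_half = z * s * math.sqrt(leverage)       # band for the fitted line
        point_half = z * s * math.sqrt(1 + leverage)  # band for a single case
        return (fit,
                (fit - line_half, fit + line_half),
                (fit - point_half, fit + point_half))

    return slope, intercept, bands

# Hypothetical scheduled versus actual in-room durations in minutes.
scheduled = [60, 90, 120, 150, 180, 240]
actual = [70, 85, 140, 150, 200, 260]
slope, intercept, bands = fit_line_with_bands(scheduled, actual)
fit, line_ci, point_ci = bands(120)
print(round(slope, 3), round(fit, 1), line_ci, point_ci)
```

Cases that fall outside the point band are natural candidates for the outlier drilldown described above.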

Teamwork  Analysis 

With a surgeon and principal anesthesiologist
assigned to each case, we can display delay causes for
those cases. As shown in the spider graph in Figure
6, delays are grouped by root cause and aggregated in the blue
bars. The farther out the bars extend, the greater the
number of occurrences for that root cause. The overlapping
orange bars are the average delays by root cause
for all physicians in that specialty, normalized by the
number of procedures done by that physician.

Using this teamwork analysis, it is possible to
identify specific teams that appear to work well together
and those that are not routinely time efficient. Other
factors, of course, must be considered in reviewing
these data, and it would be difficult to determine
root causes for efficiency or inefficiency in a specific
case. However, this knowledge may provide strategic
information that could contribute to what business
intelligence experts call a discovery cascade.

Figure 5. A detailed report of a given case. OR indicates
operating room. (Visible field: Scheduled Case Duration.)

Results 

Results  of  an  initial  rollout  of  the  Web  system  were 
assessed  through  interviews  with  senior  management. 
This included discussions with the chief medical officer,
chief operations officer, chief nursing officer,
chairs of surgery and anesthesiology, and several
perioperative managers. In presenting this potentially
disruptive tool to management, we performed
a  strengths,  weaknesses,  opportunities,  and  threats 
analysis  (known  in  business  intelligence  parlance  as  a 
SWOT  analysis)  to  classify  their  observations. 

Strengths 

Our Web-based approach was seen as a powerful
tool that would aid management in identifying systemic,
process-driven root causes for delays and other
problems and that had the potential for positive
effects on the culture of the organization. Among the
positive aspects they cited were: (1) this approach
turns traditional paper-based data into knowledge
and presents this knowledge in easy-to-digest
chunks; (2) the dashboard is independent of any single
vendor; (3) if used with a data repository it has
the potential ability to link data from different information
systems; (4) it provides a systemic view that
can calculate the total costs of root causes; (5) it
provides a quick visual way to target improvement;
(6) additional metrics can be added at the request of
management with minimum programming effort; and
(7) visual displays make outliers and trends more
easily identifiable and render distributions more
easily understood than standard aggregate statistics.

Figure 6. A spider graph of the number of delays broken
down by delay type and plotted for the entire department and
the section.

Weaknesses 

The  reviewers  identified  4  areas  of  weakness  and 
potential  improvement  for  the  Web-based  system. 

Timeliness. Depending on the method of data extraction,
data may not be live or near live. If data are provided
via an upload, they will be only as recent as the
last event. The dashboard optimally should have an
Open Database Connectivity connection to a clinical
data repository (CDR) or similar copy of the live production
environment. The update schedule of the
CDR will determine the timeliness of the dashboard.

Personnel resources. Skilled personnel are required
to build and maintain a graphical dashboarding
application. A surgical informaticist is a good choice
as they have the clinical domain experience combined
with the principles of management and information
technology. This person needs to guide the development
of metrics using clinical knowledge to extract
meaningful and relevant data. This individual can
also play a crucial role in bridging the cultures among
health care providers, information technology specialists,
and business process managers.26

Hardware  resources.  Hardware  resources  include 
access  to  server  space  with  sufficient  processing 
power  and  storage  to  handle  a  large  database.  The 
database storage space required is minimal; however,
the central processing unit that drives the data
mining must be powerful.

Management  training.  The  introduction  of  analytics 
and  acceptance  of  business  intelligence  practices 
within  a  group,  particularly  one  that  already  has  a 
long-engrained  operations  process,  cannot  be 
accomplished  overnight.11  An  investment  of  time 
and  effort  is  required  and  involves  education  of 
managers  on  the  use  of  these  tools  and  the  ways  in 
which they can be incorporated into decision making.
In the process, the focus of the managers and
the  entire  organization  should  change  from  trying  to 
understand  the  latest  event  to  looking  at  trends 
within  the  data  to  predict  what  will  happen  next  and 
to  identify  ways  to  achieve  the  best  possible  results. 

Opportunities 

A dynamic surgical block utilization chart with easily
available drilldown, as created in our project,
allows surgical chiefs to continually monitor utilization.
The drilldown permits them to see which days
are being underutilized and by whom and points to
immediate courses of action rather than waiting for
end-of-year retrospective and analysis. When competition
is fierce for OR time, this transparency can
be extraordinarily valuable for surgical practices.

Another potential benefit can accrue from
matching staffing with caseload to optimize OR efficiency.27
Because cases in overutilized time are 1.75 times
more expensive than cases during normal staffing
hours, the goal is to match caseload with staff.28 The
dashboard tool pulls in scheduling data from the
clinical information system and can display the number
of projected cases at 1-hour intervals. The dashboard
can also use retrospective data on add-on
cases to estimate a caseload probability by the hour.
To maximize efficiency, a user input can be created
whereby a manager or charge nurse may enter data
on staffing levels, helping to match caseload to
staffing.
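
One simple way to combine scheduled cases with retrospective add-on data is to add the historical mean number of add-on cases for each hour to the scheduled count and compare the result with the staffing levels entered by a manager. The sketch below assumes that approach; the function names, the data layout, and all numbers are hypothetical, and the text does not specify the estimation method used by the dashboard.

```python
def expected_caseload(scheduled_by_hour, addon_history_by_hour):
    """Scheduled cases plus the historical mean of add-on cases, per hour.

    scheduled_by_hour: {hour: number of scheduled cases}
    addon_history_by_hour: {hour: [add-on counts observed on past days]}
    """
    result = {}
    for hour in sorted(set(scheduled_by_hour) | set(addon_history_by_hour)):
        history = addon_history_by_hour.get(hour, [])
        mean_addons = sum(history) / len(history) if history else 0.0
        result[hour] = scheduled_by_hour.get(hour, 0) + mean_addons
    return result

def staffing_gaps(caseload, staffed_rooms):
    """Hours in which the expected caseload exceeds the staffed rooms."""
    return {
        hour: load - staffed_rooms.get(hour, 0)
        for hour, load in caseload.items()
        if load > staffed_rooms.get(hour, 0)
    }

# Hypothetical inputs for three hours of the day.
scheduled = {7: 10, 15: 6, 17: 3}
addon_history = {15: [2, 1, 3], 17: [4, 2, 3]}
staffed = {7: 12, 15: 8, 17: 4}

caseload = expected_caseload(scheduled, addon_history)
gaps = staffing_gaps(caseload, staffed)
print(caseload, gaps)
```

A fuller estimate would use the distribution of add-on counts, not only the mean, to express the caseload as a probability for each hour.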

Along with the scheduling efficiency, OR senior
staff and managers may want to match clinical proficiencies
with cases. Displays can be created to show
circulator/scrub combinations with surgical specialties
and case types similar to the surgeon-anesthesia
graphs presented. This gives opportunities to the
managers to maximize good teams and to identify
teams that need improvement.

Many opportunities are available for benchmarking
between services and between organizations.
The only limitation is the ability to capture
data from an information system or network, to
apply a meaningful analysis, and to provide an easily
understood graphic for the appropriate audience.

Current Procedural Terminology (CPT) codes
would be an important additional piece of information
because case efficiency should be benchmarked
against similar cases. For example, cardiac thoracic
cases have long turnover times because of the degree
of complexity involved in setup of equipment, drawing
of drugs, patient preparation, etc. National benchmarking
data can be imported to compare against the
organization's benchmark reports for CPT and diagnosis-related
groups, morbidity and mortality, length of stay,
and complications and can be presented in an easy-to-navigate
and easy-to-understand visual.

Opportunities for assessing clinical outcomes
include measures such as infection control, preoperative
antibiotic compliance, unplanned returns to
the OR, staff compliance on chart quality, timeliness,
completeness, staff arrival time, etc. Financial
reports can include direct costs, indirect costs, contribution
margins per case/specialty, labor costs,
supply costs per case, and metrics associated with
defining and monitoring best practices.

Threats 

In all, 2 potential threats to a system such as the
one we devised were identified by the interview
group and by our own developers.

The first threat is in the area of data quality and
integrity. The data retrieved for our clinical information
system have 2 sources of origin. First, scheduling
data are obtained. These include, but are not
limited to, scheduled start date and time, duration,
procedure, surgeon, and anesthesiologist. The schedulers
at our institution reside both centrally in a surgical
posting office and decentralized in physicians’
offices (in the oral maxillofacial and organ transplant
services). Because scheduling data do not directly
enter the patient's medical record or roll directly into
clinical documentation that must be reviewed and
modified by a nurse, it is assumed that the risk for
bias is minimal. Manipulation of case durations and
scheduled start times is limited by system controls.
The surgical posting office does have the ability to
override system data (eg, for scheduled case duration,
which is a by-product of historical averages), but this
is not done without the approval from a supervisor.

Second, data from nursing documentation are under constant
review by various clinicians to audit work. This
process ensures data integrity and compliance and
serves as a modest check and balance. Most of these
data are objective, and although some bias may be
present, this will most likely be minimal. The area of
documentation that is most prone to bias is the
“delay reason,” because of its highly subjective
nature and possible repercussions from management.
Another factor for inaccurate delay reporting
is the phenomenon of cascading delays (ie, when a
delay early in the day causes delays in subsequent
cases). By the time of the third or fourth delayed
case, it is difficult to ascertain the cause other than
to note that the previous case “ran over.” During this
series of delays, an entirely different cause of delay
may happen in a specific case, but the reference
point of a scheduled start time is lost, so that it is
much more difficult to document a delay cause and
duration. Of course, time stamps (in-room time
minus scheduled start time) provide a well-documented
and precise record of delay in minutes, but
this does not qualify delay by reason type and provides
no insights for root cause analysis.

The second threat to initiation of a system such
as the one we developed lies in the general perceptions
by staff and physicians. Many may feel that
they are being spied upon or monitored, especially in
areas in which no previous metrics existed. Others
may find themselves out of their routine comfort
zones. Underperforming staff who worry that they
may be identified by the system may aggressively
resist implementation of the new tools or work to
undermine data integrity. Depending on the organization's
structure, the open availability of data could
result in punishment for individuals or a group rather
than the intended promotion of positive departmental
and institutional change. Moreover, in environments
in which competition for OR time is strong, surgical
chiefs may be tempted to use data as a weapon to
promote their own agendas.

The transparency of the data should alleviate
some of these concerns. Team members should be
able to see the data on which performance is being
judged. In the past, data were obtained by someone
walking around with a clipboard or in a back office
recording data off charts, with limited or no ways to
verify whether the data were true and accurate. With
the drilldown features and different ways of organizing
data for dashboard display, team members can
easily view the raw data.

Conclusion 

Strategic decisions made based on management
instinct have a lasting effect on the well-being of an
organization. Management could benefit from the
adoption of business intelligence tools that provide a
quantifiable, validated alternative to instinct and ad
hoc decision making.

Behavior  and  practice  changes  are  central  to 
achieving  the  objective  of  quality  reports  that  drive 
efficiency.  Too  often,  data  are  not  integrated  within  the 
scope  of  daily  practice.  Acceptance  of  the  importance 
of  data  must  become  a  part  of  the  culture  of  the 
organization. Graphical dashboards that present information
down to the simplest, easiest-to-understand,
and  most  accurate  levels  can  compel  this  behavior 
change.  Managers  in  the  perioperative  environment 
should  seize  the  opportunity  to  integrate  data  into 
their  organizations’  cultures.  Our  research  suggests 
that  one  promising  approach  is  in  Web-based  tools  that 
can  be  made  for  targeted  audiences  and  adjusted  by 
role,  position,  or  location.  The  result  can  be  total 
participation  in  quality  improvement  and  constant 
feedback  that  provides  long-term  rewards  in  cost 
efficiencies,  staff  and  physician  satisfaction,  and 
improved  patient  outcomes. 

Abstract. The University of Maryland Medical Center and School of Medicine
have  sponsored  a  program  of  research  targeted  at  the  enabling  of  technologies  for 
enhanced  training,  clinical  effectiveness  and  patient  safety.  The  pillars  of  this 
research  included  scientific  approaches  related  to  Informatics,  Smart  Image, 
Simulation  and  Ergonomics  and  Human  Factors.  The  evolving  research  effort 
opened  the  door  to  a  revised  concept  of  basic  surgical  sciences  that  underpin 
training  and  performance  in  the  operative  environment. 

Keywords.  Surgery,  training,  innovation,  surgical  basic  sciences 


Background 

The  phrase,  operating  room  of  the  future  (ORF),  has  been  used  to  describe  the 
development  of  medical  technology  and  the  improvement  of  function  and  safety  of  the 
perioperative  environment.  The  research  program  in  the  Department  of  Surgery  at  the 
University  of  Maryland  has  extended  the  meaning  of  the  ORF  to  the  study  of  functions 
and  interactions  of  people,  processes  and  technology  producing  a  safe  and  efficient 
operating  suite. 


The  Research  Portfolio 

For  five  years,  the  University  of  Maryland  Medical  Center  and  School  of  Medicine 
have  sponsored  a  program  of  research  targeted  at  the  enabling  of  technologies  for 
enhanced  training,  clinical  effectiveness  and  patient  safety.  Initially,  under  the  rubric  of 
“The Operating Room of the Future,” various pillars of research were established that
proposed  to  advance  the  state  of  medicine,  notably  surgery.  The  pillars  included 
scientific  approaches  related  to  Informatics,  Smart  Image,  and  Simulation.  The 
evolving  research  effort  opened  the  door  to  a  revised  concept  of  basic  surgical  sciences 
that  underpin  training  and  performance  in  the  operative  environment. 

Developments led to two important changes: the adoption of a new mantra,
Innovation  in  the  Surgical  Environment,  to  replace  the  Operating  Room  of  the  Future; 
and  the  addition  of  another  research  pillar,  that  of  Ergonomics  and  Human  Factors. 


Corresponding  Author:  Gerald  Moses,  MASTRI  Center,  Division  of  General  Surgery,  University  of 
Maryland  Medical  Center,  22  S.  Greene  Street,  Baltimore,  MD  21201;  gmoses@smail.umaryland.edu 
Progress has been achieved in each of the pillars of research, as reported at a recent
annual conference that sought to apply lessons learned from the high-stakes
environments  of  aviation  and  astronautics  to  the  practice  of  surgery. 


Research  Pillars 

The  medical  informatics  pillar  includes  a  Perioperative  Scheduling  Study,  a  study  of 
workflow  around  performance  indicators  in  the  peri-operative  environment  and 
building  a  graphical  dashboard  to  allow  data  mining  and  trend  analysis  of  operating 
indicators.  The  surgical  simulation  pillar  entails  both  physical  and  cognitive  simulation 
for  training  with  emphasis  upon  laparoscopic  surgery.  A  third  pillar  is  entitled  “Smart 
Image,” in which we are seeking to push the boundaries of real-time deformable image
registration with a goal of performing the first fully smart image-guided laparoscopy. A
recently  added  pillar  of  ergonomics  and  human  factors  addresses  the  impact  of  stress 
movements  and  position  upon  the  surgeon  performing  minimally  invasive  or  “open” 
procedures. 

Informatics:  Workflow  and  Operations  Research  for  Quality  (WORQ) 

The Perioperative Scheduling Study is looking at how using post-operative destination
information during the process of surgery scheduling can influence congestion in post-operative
units such as intensive care units (ICUs) and intermediate care units (IMCs),
which leads to overnight boarders in the post-anesthesia care unit (PACU). We have
developed a mathematical congestion evaluation model for evaluating congestion in
post-operative units, including ICUs, IMCs, and floor units. This model requires data
about post-operative destinations and length-of-stay distributions for different types of
surgeries. We have analyzed data about cardiac surgeries from two years and have
analyzed UMMC financial records for all of the surgical cases for fiscal year 2007. We
have developed an algorithm for predicting bed requirements based on the surgical
schedule and have conducted a preliminary study comparing these predictions to other
prediction methods for two units. The preliminary results show that the new bed
requirements prediction method is more accurate.
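
To make the inputs named above concrete, the following sketch computes the expected number of occupied beds in a unit on a given day from a surgical schedule, destination probabilities, and length-of-stay distributions. It is a simplified illustration and not the study's congestion model or prediction algorithm: it ignores transfers between units, and every surgery type, probability, and schedule entry is hypothetical.

```python
def expected_beds(schedule, destination_prob, los_survival, unit, day):
    """Expected occupied beds in `unit` on `day`.

    schedule: list of (surgery_day, surgery_type)
    destination_prob: {surgery_type: {unit: P(patient goes to unit)}}
    los_survival: {surgery_type: {unit: [P(still in unit k days after surgery)]}}
    """
    total = 0.0
    for surgery_day, surgery_type in schedule:
        elapsed = day - surgery_day
        if elapsed < 0:
            continue  # surgery has not happened yet
        p_dest = destination_prob[surgery_type].get(unit, 0.0)
        survival = los_survival[surgery_type].get(unit, [])
        p_still_in = survival[elapsed] if elapsed < len(survival) else 0.0
        total += p_dest * p_still_in
    return total

# Hypothetical inputs (illustrative values only).
destination_prob = {
    "CABG": {"ICU": 0.9, "IMC": 0.1},
    "valve": {"ICU": 1.0},
}
los_survival = {
    "CABG": {"ICU": [1.0, 0.5, 0.25], "IMC": [1.0, 0.5]},
    "valve": {"ICU": [1.0, 0.75, 0.5]},
}
schedule = [(0, "CABG"), (0, "CABG"), (1, "valve")]

icu_day0 = expected_beds(schedule, destination_prob, los_survival, "ICU", 0)
icu_day1 = expected_beds(schedule, destination_prob, los_survival, "ICU", 1)
print(icu_day0, icu_day1)
```

Comparing such expected values with unit capacity for each day of a proposed schedule indicates where congestion, and therefore PACU boarding, is likely.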

Informatics:  Operating  Room  Glitch  Analysis  (OGA) 

The  OGA  project,  focusing  on  institutional  learning,  is  looking  at  the  workflow  around 
performance  indicators  in  the  peri-operative  environment  and  building  a  graphical 
dashboard  to  allow  data  mining  and  trend  analysis  of  operating  indicators. 

We have integrated into the data architecture a JavaScript-based bubble chart that
provides several interactive features to allow thorough data discovery. The bubble chart
can play over time to show how the size of each bubble, which relates to the
number of cases performed, changes, as well as its x and y axis location. The x and y axes can
represent delay duration, actual procedure time, scheduled procedure time, or turnover
time. A bubble can also be tagged to provide a contrail to show performance over
time. Figure 1 indicates an analysis of service delay as related to average length of
surgical procedure.

Figure 1. Service delay as related to average length of surgical procedure. (Panel title: Service Delay Analysis.)


Informatics:  Video  Summarization  of  Key  Events  in  Surgery 

The  technique  of  summarization  is  used  when  confronted  with  the  task  of  gleaning 
succinct  information  from  large  amounts  of  data.  For  example,  our  national 
intelligence  services  use  both  machine  and  human  analysis  to  prepare  the  daily 
Intelligence  Summary  for  the  President.  A  similar  challenge  is  presented  to  those  who 
train  surgeons  using  a  vast  archive  of  surgical  video.  A  key  element  in  teaching  is  the 
extraction  of  the  right  video  event  to  make  the  critical  point  to  surgical  trainees. 

Recent  decades  have  seen  an  increasing  use  of  VR  and  simulation  aids  in  surgical 
training.  The  typical  approach  is  to  use  sensors  to  capture  the  kinematics  of  the  tools, 
as  well  as  force/torque  measures.  One  thread  of  work  directly  analyzes  these 
measurements  to  construct  Markov  Models  that  describe  the  state  and  transitions  for  a 
surgical  procedure,  and  it  is  then  shown  that  the  transition  probabilities  between  states 
are  different  at  different  levels  of  expertise. 
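
A minimal sketch of the transition-probability idea, assuming the sensor measurements have already been segmented into discrete, labeled states (that segmentation step is not shown, and the state names below are hypothetical). It estimates first-order transition probabilities by counting consecutive state pairs, which is one common way to fit such a model and not necessarily the method used in the cited work.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate first-order Markov transition probabilities from state sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count each observed (current state -> next state) pair.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    # Normalize each row of counts so the outgoing probabilities sum to 1.
    return {
        state: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for state, followers in counts.items()
    }


# Hypothetical state sequences for illustration only.
expert = [["idle", "grasp", "cut", "idle"], ["idle", "grasp", "cut", "idle"]]
novice = [["idle", "grasp", "idle", "grasp", "cut", "idle"]]

p_expert = transition_probabilities(expert)
p_novice = transition_probabilities(novice)
```

In this toy data the expert always moves from `grasp` to `cut`, whereas the novice returns to `idle` half of the time; differences of this kind in the estimated probabilities are what the approach uses to separate levels of expertise.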

An alternative approach is for an expert to look at the video of the surgical
procedure (or training), identify key steps/events (either done well or incorrectly), and
then judge the skill level of the performer. This approach can bring to bear the expert’s
knowledge and intuition of the complex interaction between tools, movements, organs,
cutting planes, etc. The drawback, however, is that it requires reviewing video, which
can be very time-consuming. We propose to address this problem by developing
techniques to automatically identify key scenes/events in a video of laparoscopic
surgeries.

Simulation


We  are  conducting  multiple  studies  of  the  effects  of  physical  box  trainers,  virtual 
reality  (VR)  trainers,  and  mixed  modality  training  for  acquiring  laparoscopic  surgery 
skills.  These  studies  support  the  actions  and  operations  of  the  Maryland  Advanced 
Simulation  Training,  Research  and  Innovation  (MASTRI)  center.  Additionally,  we  are 
developing  a  cognitive  simulator  and  building  knowledge  representations  based  on 
ontology  of  focused  human  anatomy/physiology  to  emulate  the  surgical/clinical 
experience.  The  cognitive  simulator,  the  Maryland  Virtual  Patient,  has  been  developed 
by  construction  of  a  computational  model  of  the  cognitive  agent,  and  by  testing  the 
goal-  and  plan-based  reasoning  component  and  its  interaction  with  the  interoceptive 
and  language  perception  modules  and  verbal,  mental  and  physical  action  simulation 
modules. 

We have continued to work on the natural language substrate of the system,
concentrating on enhancements required for processing dialog (as opposed to expository
text). Further, we have implemented an enhanced microtheory of indirect speech acts and
continued working on reference resolution algorithms.

The research also encompasses the acquisition of ontology and lexicon knowledge and
improvement of the DEKADE user interface. The current version of the cognitive
simulation system includes multiple scenarios of physician-patient interface related to
LERD/GERD patient conditions.

Smart  Imaging 

Surgical  practice  is  considered  among  the  most  complex  and  difficult  fields.  That  no 
two  patients  are  exactly  alike  is  one  of  the  challenges  that  make  it  so.  Anatomic  and 
physiologic  differences  make  each  case  unique.  In  surgery,  these  variations  can 
complicate  an  operation;  the  discovery  of  unexpected  anatomical  variations  often 
requires  a  surgeon  to  stray  from  standard,  well-practiced  techniques  to  attempt  a  novel 
approach  to  the  procedure.  With  novelty  comes  a  reduced  margin  of  safety.  This 
situation  is  exacerbated  by  a  trend  toward  further  physical  separation  between  the 
patient  and  interventionalists  (e.g.,  surgeons,  endoscopists,  radiologists)  and  a  greater 
dependence  on  an  image  of  the  patient’s  (target)  anatomy  to  effect  therapy  or  establish 
a  diagnosis. 

“Smart  image,”  as  we  have  defined  it,  refers  either  to  the  process  of  extracting 
elements  from  an  environment  and  imparting  them  to  an  image  or  to  acquiring 
elements  from  within  a  scene  and  enhancing  them.  The  result  in  either  case  is  a  more 
meaningful  visualization  of  the  operative  field.  Although  many  applications  exist 
within  this  definition,  Maryland’s  smart  image  team  is  working  toward  performing  the 
first  laparoscopic  surgery  guided  completely  by  smart  image. 

Typically  in  laparoscopic  procedures,  diagnostic  imaging — including  x-rays, 
computerized  tomography  (CT),  and  magnetic  resonance  imaging  (MRI)  scans — can 
provide  a  preview  of  patient  physiology.  Often,  however,  these  diagnostic  images  are 
in  a  static  format  that  does  not  allow  the  care  provider  to  interact  meaningfully  with  the 
information  the  images  contain.  Current  advances  in  smart  imaging  can  be  used  to 
improve  patient  safety  by  providing  the  caregiver  with  a  more  interactive  experience.  A 
set of two-dimensional (2D) slices of a CT scan can be transformed into a
three-dimensional (3D) computer model so that surgeons can preview a realistic view of the
patient’s  anatomy  before  an  operation.  This  type  of  smart  imaging  provides  an 
interactive  “fly-through”  view  that  allows  the  surgeon  to  explore  the  anatomy  in  detail. 
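
The slice-to-volume step can be sketched in a few lines. This is only an illustration under simplifying assumptions: slices are plain nested lists of intensities with identical dimensions, and the function names are hypothetical. A real pipeline would read DICOM data and render surfaces or volumes, which is well beyond this example.

```python
def stack_slices(slices):
    """Stack equally sized 2D slices (rows of intensities) into volume[z][y][x]."""
    rows, cols = len(slices[0]), len(slices[0][0])
    for s in slices:
        if len(s) != rows or any(len(r) != cols for r in s):
            raise ValueError("all slices must have the same dimensions")
    # Copy the rows so the volume does not alias the input slices.
    return [[list(row) for row in s] for s in slices]


def coronal_view(volume, y):
    """Reslice the volume along a plane that no single axial slice shows."""
    return [axial_slice[y] for axial_slice in volume]


# Three tiny 2x2 "axial slices" with made-up intensities.
axial = [
    [[0, 1], [2, 3]],
    [[4, 5], [6, 7]],
    [[8, 9], [10, 11]],
]
volume = stack_slices(axial)
view = coronal_view(volume, y=1)
```

Once the slices are held as a single volume, views along other planes (here a coronal reslice) can be extracted on demand, which is the basic property an interactive preview relies on.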

With  advances  in  computing  power,  these  previews  could  be  mapped  more 
realistically  to  interactive  simulators  that  would  permit  rehearsal  of  a  surgical 
procedure  that  might  include  attempts  at  novel  approaches  before  surgery  begins. 
During  real  surgery,  these  smart  diagnostic  images  could  be  integrated  into  the 
surgeon’s  actual  view  of  the  patient. 

We  are  working  toward  matching  the  minimally  invasive  surgeon’s  video  view  of 
the  surface  anatomy  with  computer-generated  models  from  Digital  Imaging  and 
Communications  in  Medicine  (DICOM)  data  sets.  Such  imaging  could  provide  the 
surgeon  with  real-time  “x-ray  vision”  during  the  operation.  Thus,  the  underlying 
structure,  such  as  the  position  of  a  tumor  beneath  the  surface  of  a  larger  anatomic 
structure  or  blood  vessels  within  the  liver,  could  be  seen.  Vessels  could  be  contrast- 
enhanced  in  a  single,  high-resolution  CT  scan  before  the  surgery.  Then,  during  surgery, 
low-dose/low-resolution  CT  scans  could  be  used  to  transform  the  high-resolution  CT 
image  to  match  the  movement  of  the  patient’s  anatomy  during  surgery.  This  would 
allow  intraoperative  visualization  of  anatomy  that  retains  the  enhanced  contrast  vessels, 
a  unique  ability  that  is  not  possible  at  present. 

CT  scans  can  provide  enhanced  intraoperative  visualization  of  deep  structures  far 
superior  to  that  of  laparoscopes.  However,  the  use  of  continuous  CT  exposes  the  patient 
and  surgeon  to  a  radiation  level  that  remains  a  concern.  Therefore,  a  major  thrust  of  our 
work  is  to  design,  develop,  and  test  several  dose-reduction  strategies  and  to  incorporate 
these  into  our  proposed  continuous  CT-guided  surgical  navigation  system.  Our 
preliminary  work  suggests  that  our  strategies  would  allow  us  to  lower  the  net  radiation 
exposure  to  the  patient  to  levels  commonly  viewed  as  safer  in  cardiac  catheterization 
and  interventional  radiology  procedures.  In  the  long  term,  we  also  propose  using 
telemanipulators  to  remove  surgeons  from  the  CT  room  and  thereby  shield  them 
entirely  from  radiation  exposure  while  they  are  performing  the  procedure. 

Ergonomics  and  Human  Factors 

Recently,  a  fourth  pillar  was  added  to  our  research  portfolio,  that  of  Ergonomics  and 
Human  Factors.  These  are  two  related  branches  of  study  that  examine  the  relationship 
between  people  and  their  work  environment.  Ergonomics  often  focuses  on  the  physical 
environment  and  the  human  body,  while  human  factors  center  more  on  the  cognitive 
aspects  of  performance.  The  same  ergonomics  and  human  factors  techniques  credited 
with  making  industrial  processes  safer  and  more  efficient  can  be  applied  to  the  analysis 
and  improvement  of  OR  operations.  Tools,  such  as  video  analysis  and  motion  tracking, 
can  be  used  to  analyze  current  practices,  identify  inefficiencies  and  dangers,  develop 
solutions,  and  measure  improvement.  “Best  practices”  to  maximize  safety  and 
efficiency  can  be  developed  based  on  empirical  data. 

Our  discussion  of  workflow  to  this  point  has  taken  a  macro  or  panoramic  view;  for 
example,  how  might  we  most  effectively  track  and  bring  together  the  people  and  assets 
necessary  to  ensure  that  a  patient’s  surgical  experience  is  safe  and  efficient.  Through 
human factors and ergonomics, we have the ability to focus on a more micro-level
analysis, such as measurements of the surgeon/instrument interface and how the physical
interface  between  the  surgeon  and  the  patient  could  be  improved. 

In  the  future,  OR  workspace  layout  would  be  optimized  through  ergonomic  data 
and  human  factors  analysis,  and  this  optimization  would  lead  to  the  establishment  of 
“best  practices”  for  an  array  of  surgical  operations.  Proper  layout  would  reduce  risks  of 
infection,  speed  operations,  and  reduce  fatigue  of  surgeons  and  staff,  all  elements  that 
could  contribute  to  a  reduction  in  adverse  events  and  improved  patient  safety. 


Future  Vision  of  the  Operating  Room  Environment 

Well-trained  care  providers,  who  have  reached  a  level  of  proficiency  on  realistically 
simulated  patients,  are  supported  by  an  array  of  smart  technology  enabling  surgical 
procedures  to  be  performed  in  an  ever  safer  environment.  Cases  start  on  time  with  all 
team  members  informed  of  the  goals  and  possible  trouble  spots  of  each  operation. 
Contingency  plans  are  in  place  for  dealing  with  anticipated  complications.  The  smart 
environment checks that all required equipment and people are present and
cross-checks drugs and blood products brought into the room, ensuring patient compatibility
in  terms  of  allergies  and  blood  type.  Surgeons  do  not  have  to  fight  fatigue  and 
discomfort  during  surgery,  as  the  layout  of  the  surgical  workspace  is  ergonomically 
correct.  Thus,  the  time  and  effort  needed  to  perform  surgery  is  minimized  and 
improvement  of  both  technique  and  outcomes  is  realized. 


A  New  Set  of  Basic  Surgical  Sciences 

The  potential  of  surgical  care  in  the  future  can  be  realized  by  incorporating  into  the 
training  of  surgeons  a  new  set  of  basic  surgical  sciences,  those  of  advanced  imaging, 
informatics  systems,  simulation  and  ergonomics  and  human  factors.  These  do  not 
replace  the  well  established  scientific  bases  of  anatomy,  physiology,  pathology  and 
related  areas  of  study.  Rather  they  add  a  vital  underpinning  to  the  knowledge  and 
expertise required of future practitioners.

Though in its infancy, the discipline of surgical ergonomics is increasingly valued.
Still, little has been written regarding this field's tasks, models, and measurement
systems. These 3 experimental components are crucial in objectively and accurately
assessing joint and postural control as exhibited by expert laparoscopic surgeons. Such
assessments will establish characteristic patterns important for surgical training. In
addition, risk factors associated with both minimally invasive surgical instruments and
the operating room environment can be identified and minimized. Our review focuses on
evidence-based experimental ergonomic studies undertaken in the field of laparoscopic
surgery. Publications were located through PubMed and other database and library
searches. This article describes tasks, models, and measurement systems and considers
their specific applications and the types of data obtainable with the use of each.
Advantages and limitations, especially those of measurement systems, are compared and
discussed. Future trends and directions believed necessary for optimal investigation and
results are also addressed.

Keywords: surgical ergonomics; methodology; measurement systems; laparoscopy; review


The rapid acceptance of laparoscopic surgery as a clinical alternative to traditional
open surgery sometimes obscures the fact that minimally invasive surgery (MIS) is still
relatively new. Newer still is our realization of the physical demands MIS can make on
its practitioners, but those are precisely the issues that the very recent discipline
of surgical ergonomics seeks to understand and address. Like all disciplines that start
out at once self-contained yet also multidisciplinary, the field of MIS ergonomic
research is fast becoming broadened through a variety of different approaches, so
numerous that those outside the field as well as many inside the field may possess scant
knowledge regarding the vast methodologic array of tasks, models, and assessments in use.

Ergonomics examines and seeks to minimize risk factors between human beings and the
tasks and environments that occupy them. Its historical origins are traceable to two
editions (1700 and 1713) of De Morbis Artificum Diatriba, or Diseases of Workers, in
which the job hazards workers encountered were characterized by Bernardino Ramazzini,
often termed occupational medicine's founder,1 and also to the 1857 article, “An essay
on ergonomy, or science of labour, based on the laws of natural science,” by W.B.
Jastrzebowski, who is credited with the first use of the term derived from coupling the
Greek words for work (“ergon”) and for natural law (“nomos”).2 Despite those and other
early influences, the science of ergonomics is still regarded as a fairly recent
discipline, one just shy of its 60th anniversary as a formal body of knowledge.3

The relative youthfulness of ergonomics belies the significance of the contributions it
has made in many occupational fields, including the military, athletics, and medicine,
in addition to other environments. Fatigue, stress, and equipment use as factors
affecting task performance were of great interest to the military during World War II.
Indeed, to study the strain that might be experienced by air personnel flying long-range
missions, a simulated cockpit was built by Kenneth Craik, inaugural director of the
Applied Psychology Research Unit (Cambridge).4 Among the many sports benefiting from
ergonomic applications is track and field. Proper shoe design and material composition
can greatly reduce fatigue level and improve running performance.5 “Dispersion of
attentional resources,” a unique measure recently established within the anesthesia work
domain for evaluating the workplace awkwardness that results from simultaneous task
performance, is typical of knowledge ascertained through ergonomic clinical research.6
Ergonomic theory, design, and applications play an important role in everyday life, as
is evidenced by accumulating research on everything from the effect of computer keyboard
designs on users with upper extremity musculoskeletal disorders7 to the collection and
analysis, using motion capture and pressure sensors, of posture parameters and sitting
strategies to determine car seat design that is both comfortable and ergonomically
sound.8

The  study  of  ergonomics  in  the  surgical  arena  has 
acquired  increased  importance  with  the  advent  and 
widespread  acceptance  of  laparoscopic  procedures. 
Overall,  the  case  has  been  made  that  its  advantages 
often  make  MIS  the  preferable  alternative  for  patients. 
Specifically,  it  falls  to  surgical  ergonomics  to  make  the 
case  through  problem  definition,  research  surveys 
and  studies,  and  data  acquisition  and  analysis  that 
MIS  is  often  a  demanding  alternative  for  its  surgical 
practitioners.  This  phenomenon  results  primarily  from 
the  different  nature  of  the  minimally  invasive  surgical 
environment,  which  surfaces  new  issues  for  the  surgeon 
whose  access  to  information  and  ability  to  move  are 
limited  in  particular  ways. 

Though literature reviews of ergonomics in minimally invasive surgery have been
undertaken, they remain few in number. In explanation of a newly coined term,
minimal-access surgery (MAS)-related surgeon morbidity syndromes, a lengthy review was
undertaken covering a broad array of issues identified as problematic, such as
instrument design, operative display systems, and access ports in addition to injury
mechanisms resulting from procedural technique.9 A limited review covered ergonomic
laparoscopic research accomplished within 3 broad categories: physical, sensorial, and
cognitive.10 A more comprehensive review detailed the variety of MIS ergonomic studies
on visualization, manipulation, posture, and workload (mental and physical), as well as
on the operating environment overall.11 The effects of visual and haptic perception,
often reduced, on surgical performance are reviewed in an article that also investigates
research on force.12

In this review we focus on the methodology specifically used within the MIS ergonomic
discipline for collection and analysis of data. In doing so, we examine and discuss the
tasks and models germane to studies within this field. Our intention is to reveal the
depth and breadth of the research approaches used at this young stage of MIS ergonomics,
particularly through review and discussion of tasks, models, and measurements, and to
suggest how improvement and development of these essential factors will profoundly
affect future research and outcomes.

Tasks  and  Models 

Two crucial design components in surgical ergonomic research are tasks and models. As
laparoscopy evolves into a well-accepted, established discipline, the tasks involved in
MIS procedures have become increasingly standardized, gaining acceptance from MIS
professional and accrediting organizations.13-17 The result is that these tasks, for
example, those of the Fundamentals of Laparoscopic Surgery (FLS) program, are
increasingly used in MIS ergonomic research. Additionally, tasks such as partial
circle-cutting have been created specifically to facilitate surgical ergonomic
research. The term “models” refers to physical forms (in MIS often animal, artificially
made, or simulated) that are meant to provide a realistic approximation of operating
room (OR) conditions and human anatomy.

Static  Tasks 

Static  tasks  involve  no  motion.  An  example  of  a  static 
task  is  the  opening  and  closing  of  an  instrument 
against  a  spring-loaded  clip  at  a  set  resistance18,19 
while  the  instrument  is  held  in  a  fixed  position.20 

Simple  Navigation  Tasks 

Although they do not simulate any particular operative procedure, these tasks involve
joint movement and instrument manipulation, thus permitting measurement. Through a
variety of navigational skill tasks, fairly simple outcome metrics, including time and
error, can be derived. These types of tracking tasks include navigation around an
electrified wire course21 and simple touching of labeled points on a target.22,23

Computerized formats of these simple navigation tasks exist. For example, the Dundee
Endoscopic Psychomotor Tester (DEPT) evaluates the user's ability to navigate a 5-mm
probe with one hand through a series of holes on a target plate, culminating in touching
a plate behind the target plate.24,25 Its successor, the Advanced Dundee Endoscopic
Psychomotor Tester (ADEPT), evaluates the subject using both hands simultaneously: one
hand uses one of 2 standard laparoscopic instruments to execute the original task, while
the other manipulates the target.26 In addition to the metrics of execution time, number
of errors, and successful task completion, ADEPT is capable of measuring flight
trajectory by recording instrument positions in 3D space.
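
To make these metrics concrete, the sketch below derives execution time and a simple trajectory summary (total path length) from a series of timestamped 3D instrument positions. Path length is used here as an assumed, commonly reported trajectory measure; the article does not specify how ADEPT itself summarizes flight trajectory, and the function names, units, and sample values are hypothetical.

```python
import math


def path_length(positions):
    """Sum of straight-line distances between consecutive (x, y, z) samples."""
    return sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))


def task_metrics(positions, timestamps, error_count):
    """Collect simple outcome metrics for one trial."""
    return {
        "time_s": timestamps[-1] - timestamps[0],
        "errors": error_count,
        "path_length_mm": path_length(positions),
    }


# Made-up samples: instrument tip positions in mm and timestamps in seconds.
samples = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
metrics = task_metrics(samples, [0.0, 0.5, 1.5], error_count=1)
```

A shorter path for the same completed task suggests more economical instrument movement, which is why trajectory data add information beyond time and error counts alone.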

Manipulation  Tasks 

The most fundamental laparoscopic skills, such as object manipulation, suturing, and
cutting, comprise this group of tasks, which require bimanual coordination. Most studies
have used as their model some form of standard laparoscopic trainer box, as doing so
makes research less reliant on proprietary, often expensive or inaccessible technology.
For example, small object manipulation into a small aperture, instrument-to-instrument
rope passing, and cable tying have been used as a representative series of tasks for
assessing laparoscopic muscle activation.27 Other groups testing similar skills have
used tasks involving transfers, shape cutting, point touching, and needle handling.28-30
Cutting partial circles has been used for assessing operative table height31 and monitor
height.32 A more difficult task that required lifting and cutting of various threads
from a foam board33 was used to examine the effects of 2D- versus 3D-viewing technology
on task performance.

The most commonly used manipulation tasks have been variations of suturing and tying.
Such study tasks and models have ranged from tying of knots only,34-40 to suture
placement and tying on an artificial surface,41-44 to suturing of porcine
enterotomies.45-50 As much as possible, objective metrics have been developed for these
types of tasks. For example, Cuschieri and colleagues developed a knot quality score
related to the knot breaking or slipping force and the strength of the suture material
itself.51 Suturing tasks with their specific marked targets are easily graded based on
accuracy of suture placement and time to completion. Enterotomy closures have been
graded based on leak pressure.


Traditional  Training  Boxes 

The  traditional  training  box  comes  in  various  forms, 
but  its  basic  construct  remains  the  same.  It  is  a 
confined  space  roughly  approximating  the  abdominal 
cavity  into  which  ports  may  be  placed  to  allow 
instrument  access.  Visualization  is  achieved  through 
a  camera,  which  can  be  a  laparoscope  or  simple 
charge-coupled  device  (CCD)  camera.  A  multitude 
of  different  tasks  such  as  object  transfer,  circle 
cutting,  and  bowel  suturing  can  be  conducted  within 
this  environment. 

Virtual  Reality  Models 

Virtual reality (VR) simulators are becoming increasingly lifelike as they incorporate
haptic feedback and actual laparoscopic instruments, among other elements. They also
come advantageously incorporated with a sizable number of reproducible and standardized
measures, as well as a wide range of possible exercises. The simulated tasks used so far
have not been operative simulations per se but instead have been simulations of drills
designed to develop laparoscopic skills. For example, a needle manipulation drill on a
LapSim (Surgical Science, Gothenburg, Sweden) has recently been used to study the
potential benefit of armrests.52 MIST-VR (Medical Education Technologies, Inc.,
Sarasota, Florida), with its variety of simulated tasks, has been used to measure the
effect of cognitive distraction53 and physical workload,54 and the clipping and cutting
tasks simulated by the Xitact 500 LS (Xitact SA, Morges, Switzerland) have been used to
assess instrument handle types.55 Despite the possibilities and potentials offered, no
single simulator has gained widespread use in ergonomic analysis. In addition, to be
proposed as optimal models for use in ergonomic studies, VR simulators must improve
overall in terms of the still generally unrealistic quality of their replicated visual
and haptic feedback, variables that may result in subjects making movements that would
not be made in real or more realistic circumstances.

Intraoperative  Model 

As an ergonomic research environment in which to conduct task and movement research,
the OR is completely realistic and completely validated. The drawbacks to use of the OR
for such research, however, are considerable. Difficulties in getting ergonomic
equipment into the OR environment have barely been overcome. Two formidable issues
encountered are sterilization and cumbersome wiring. Another aspect that must be
considered is that such research within the OR would have to be properly limited to the
expert as the subject, thus excluding comparative data on trainees or those with less
experience. These obstacles have not, however, prevented all intraoperative ergonomic
research. Surgeon posture,56 camera operator movement,57 and hand movements58 have all
been recorded during laparoscopic cholecystectomies. Other ergonomic studies,
unhindered by OR limitations, have investigated display system,59 operative flow, and
procedures.60-62

Discussion 

Task and model complexity in MIS ergonomic studies
varies depending on the goal of each experiment. The
data  measured  from  simple  static  or  manipulation 
tasks  are  relatively  easy  to  analyze  and  provide  basic 
ergonomic  or  performance  information.  With  data 
obtained  from  more  complex  tasks  that  provide  more 
comprehensive  information,  such  as  body  control 
strategies,  the  experiment  must  be  very  carefully 
designed  and  performed  to  minimize  variability  within 
and  between  subjects.  For  instance,  laparoscopic 
suturing  is  a  widely  used  task  in  surgical  ergonomic 
research.  Although  this  is  an  essential  skill,  the 
performance  of  which  is  unofficially  acknowledged 
as  the  hallmark  of  a  skilled  laparoscopic  surgeon,  it 
seldom  represents  more  than  a  small  fraction  of  the 
time  spent  in  a  laparoscopic  procedure.  Maneuvers 
such  as  exposure,  retraction,  and  dissection  are  far 
more  frequently  employed  tasks  requiring  more 
extended  time  expenditure,  yet  they  are  not  often 
the  subject  of  ergonomic  analysis.  Tasks  such  as  these 
must  be  deemed  worthy  of  research  consideration. 
Additionally,  more  models  and  tasks  for  surgical 
ergonomic  studies  should  be  standardized  in  the 
manner  of  the  FLS. 

Sophisticated measurement systems built from a variety of technologies determine what
types of data are obtainable through evaluative use of tasks and models. Among the most
widely used are motion analysis for capture of body movement patterns and
electromyography (EMG) analysis for monitoring muscle activation, although force plate
analysis for evaluation of postural stability is gaining in importance.


Motion  Analysis 

Human movement is the result of complex processes involving the brain, spinal cord,
peripheral nerves, muscles, bones, and joints. Motion capture technology, which evolved
continuously from basic photography to sophisticated computer-aided motion analysis
systems, allows study of these complex processes through analysis of kinematic data that
represent the relative movements of segments connected with rotating joints. Motion
capture systems have been used in various fields of application, including gait
analysis,63,64 sports science and athletic training,65,66 biomechanics and neuroscience
research,67-69 and film and animation production.70,71 Surgical ergonomics in
laparoscopy is a relatively new field employing motion analysis. A wide variety of
movements, ranging from the maneuvering of surgical instruments to the upper and/or
lower body movements of surgeons, are captured as kinematic data and analyzed to provide
detailed information about what constitutes ergonomic safety for surgeons performing MIS
tasks and procedures.

Technologies 

Throughout the development of motion analysis systems, different technologies have been
used for accurate, objective movement measurements. Having considered all motion capture
technologies, including ultrasound tracking and electric/mechanical goniometers, we
posit that the most representative technologies, as evidenced by research group use, are
orientation, electromagnetic, optical, and video-based motion analysis systems (Table 1).

Table 1. Motion Analysis Technologies in Surgical Ergonomics

Technology: Orientation sensor
User groups: Berguer et al,31 Smith et al,32,54 Kondraske et al73
Components: Solid-state 3-axis pitch, roll, and yaw sensor

Technology: Electromagnetic tracking
User groups: Huber et al,33 Rasmus et al,57 Dosis et al,58,83 Ridgway et al,75 Datta et al,74 Mackay et al,76 Bann et al,77 Khan et al,78 Moorthy et al,79,81 Hernandez et al,80 Munz et al,82 Aggarwal et al,84 Smith et al85
Components: Electromagnetic field generator and receivers

Technology: Optical tracking
User groups: Emam et al,35,36,38,47-49 Person et al,56 van Veelen et al,59 Lee et al,72,88,89 Patil et al,86 Gillette et al87
Components: Computer-controlled video camera and retroreflective markers

Technology: Video analysis
User groups: van Veelen et al,29,40 Matern et al,30,90 Joice et al45,62
Components: Video recording and observation

Figure 1. An optical motion analysis system with multiple cameras
mounted on a truss system tracks a set of reflective markers
attached to the anatomical landmarks of a surgeon. The data is
used to reconstruct the surgeon’s body movement, which is then
visualized as a stick diagram.

An orientation sensor system measures 3 angular
movements: yaw, pitch, and roll. Electromagnetic
tracking systems use Faraday’s law to measure the
orientation and location data of each sensor. With
optical motion capture systems (Figure 1), light-emitting
diode (LED) strobes surrounding a camera
lens transmit light into a measurement space from
within which retroreflective markers, placed to represent
joint and segmental landmarks, reflect light
back to the camera. Recent optical motion capture
systems are capable of handling hundreds of markers
and reconstructing body movements in real time.
With these relatively new systems, movement at each
joint can be calculated in 3 directions (flexion/extension,
abduction/adduction, and internal/external rotation)
and, for posture analysis specifically, the center of
mass (CoM) can be calculated from combined kinematic
and anthropometric data derived from subject
measurement.72
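
The CoM calculation mentioned above can be sketched as a mass-weighted average of segment positions. This is only a minimal illustration: the function name, the three-segment body model, and the mass fractions are assumptions made for the example and are not taken from the cited study.

```python
def center_of_mass(segments):
    """Return the mass-weighted mean position of body segments.

    segments: list of (mass_kg, (x, y, z)) pairs, one per segment.
    Positions would come from motion capture (kinematic data) and
    masses from anthropometric tables scaled to the subject.
    """
    total_mass = sum(mass for mass, _ in segments)
    if total_mass <= 0:
        raise ValueError("total segment mass must be positive")
    return tuple(
        sum(mass * position[axis] for mass, position in segments) / total_mass
        for axis in range(3)
    )


# Hypothetical three-segment model of an 80 kg subject.
# The mass fractions (0.60, 0.20, 0.20) are illustrative placeholders.
body_mass_kg = 80.0
segments = [
    (0.60 * body_mass_kg, (0.00, 0.02, 1.15)),   # head, arms, and trunk
    (0.20 * body_mass_kg, (0.10, 0.00, 0.50)),   # right leg
    (0.20 * body_mass_kg, (-0.10, 0.00, 0.50)),  # left leg
]
print(center_of_mass(segments))
```

A real analysis would use many more segments and published anthropometric proportions; the weighted-average structure stays the same.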

The  simplest  system  setup  for  video  analysis 
involves a single camcorder or digital camera to monitor
the subject’s movement in 1 plane, for example,
the  sagittal  plane.  In  surgical  ergonomic  studies,  the 
endoscopic  image  has  often  been  used  to  monitor 
performance  accuracy  and  laparoscopic  instrument 
movements.  Video  recordings  captured  by  one  or  two 
cameras  at  different  angles  also  provide  approximate 
kinematic  data  analysis  of  surgeons’  body  movements. 

Comparison  of  Motion  Analysis  Systems 

The orientation sensor system is by far the most
economical motion analysis method in comparison
with either electromagnetic or optical motion capture.
A significant limitation is that, although this system
provides absolute orientation information for each
body segment to which an orientation sensor is
attached, it does not record the locations of
individual body segments; thus, full-joint
kinematic analysis is very difficult.

The best overall accuracy and resolution are provided
by an optical motion analysis system using digital
cameras. With these systems, however, blocking
problems are common, making optical motion tracking
especially problematic for use in an actual OR,
where obstacles located between markers and motion
cameras may hinder or obscure detection of reflective
markers. Another limitation associated with this system
is that “ghost markers,” or interfering noises,
occur as a result of reflections from the metallic
surfaces of surgical instruments or devices.

Electromagnetic tracking systems, although not
plagued by the blocking issue, are bedeviled by
interference with data collection that, particularly in
the OR, is caused by other sources of electromagnetic
fields in the same space or by metallic objects.84
Moreover, their measurement volume is limited. Both
issues significantly decrease measurement accuracy
when a sensor is not within transmitter range.

Video  analysis  is  the  simplest  of  such  systems.45,62 
Although  it  has  been  used  in  ergonomic  studies  for 
accuracy assessment of task completion and movement
tracking of laparoscopic instruments, the outcomes
obtained by such video analysis are considered
very  subjective  and  qualitative.  Thus,  this  system 
should  not  be  used  in  a  stand-alone  manner,  but 
only  in  conjunction  with  a  quantitative  motion  capture 
system so that the two sets of data can be time
synchronized for accurate assessment of body or
surgical instrument motion.35,58,83,86

Table 2. Applications in Surgical Ergonomics Using
Kinematic Data from Motion Analysis Systems

Assessments (reference numbers follow each item)

Physical workload and fatigue: 31,54
Effect of operating room components:
    Display systems: 32,33,59
    Instrument handle design: 38,40,44,49
    Instrument, task and scope location/orientation: 36,47,48
    Operating table height: 29,31
Skill, dexterity, efficiency: 35,44,73-80,82,85,88
Others: 45,62


Motion  Data  Analysis 

Kinematic  data,  the  measurement  outcome  of  motion 
analysis  systems,  are  helpful  in  addressing  a  variety  of 
surgical  ergonomic  issues.  Table  2  shows  important 
implications of motion analysis data. Berguer, Smith,
and colleagues used motion analysis in addition to
EMG for measurement of upper arm elevation level
to calculate physical effort/workload.31,54 The effects
on  performance  and  body  movements  of  changing 
variable  elements  in  a  variety  of  OR  components — 
including  display,  instruments,  and  operating  table — 
have  been  investigated.  Attempts  to  determine  what 
constitutes  an  optimal  laparoscopic  display  system 
have  been  marked  by  obtaining  and  analyzing  data, 
predominantly objective but also subjective (eg,
self-reports), on head rotation, instrument tracking, and
task accuracy in relation to a considerable array of
display types, heights, and locations.32,33,59 One study
defined optimal table height as the height at which
surgeons achieve optimal task performance with minimal
shoulder discomfort and workload. It suggested that
the optimal table height could be easily found by
positioning the instrument handle, as held in the
surgeon’s hand, at the same level as the surgeon’s
elbow.31 Another study found optimal operating surface
height to be between 0.7 and 0.8 times elbow
height.29 Several studies have investigated how
laparoscopic instrument handles, believed to be a primary
contributor to unsafe ergonomics, can be
redesigned to improve surgical performance and
ergonomics.38,40,44,49
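
As a small worked example of the 0.7 to 0.8 rule reported for operating surface height, the sketch below computes the suggested range for a given elbow height. The function name and the 110 cm elbow height are hypothetical; only the two factors come from the text.

```python
def operating_surface_height_range(elbow_height_cm, low=0.7, high=0.8):
    """Return (min, max) operating surface height for a given elbow height.

    The default factors 0.7 and 0.8 are the ones quoted in the text;
    elbow height is measured from the floor while standing.
    """
    if elbow_height_cm <= 0:
        raise ValueError("elbow height must be positive")
    return low * elbow_height_cm, high * elbow_height_cm


# Hypothetical surgeon with a standing elbow height of 110 cm.
lowest, highest = operating_surface_height_range(110.0)
print(f"Suggested operating surface height: {lowest:.0f} to {highest:.0f} cm")
```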

Cuschieri and colleagues have characterized joint
movement in terms of maximum movement, minimum
movement, and range of motion (ROM), in addition to
angular velocity. They have used these kinematic
measures in their investigations of the effects of
instrument, endoscope, and surgical task location and
orientation. In doing so, they have identified appropriate
intracorporeal and extracorporeal length ratios and
alternative endoscope angles and have been able to
postulate what constitutes proper orientation of
task targets.36,47,48

Motion  analysis  data  has  also  been  used  to 
describe  skill  and  dexterity  levels.  Joint  kinematic 
analysis can characterize surgical movement patterns
used by expert surgeons,35,44,88 compare different
surgical techniques,75,78 and assist in defining
learning  curves.85  Darzi  and  colleagues  defined 
motion  efficiency  by  studying  what  was  required  for 
the  hand  to  accomplish  its  objectives,  specifically  in 
terms  of  number  of  movements,  length  of  travel  path, 
and measurements of performance time. Those studies
defined motion efficiency as the smallest number of
hand movements employed, a definition that has also
been said to characterize the best kinematics and to
provide the best quantitative measurement of
surgical dexterity.74,77
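
The three quantities named above (number of movements, length of travel path, and performance time) can be sketched from tracked hand positions as follows. This is an illustration under stated assumptions: the function names are hypothetical, and counting a "movement" as a burst of speed above a threshold is one plausible rule chosen for the example, not necessarily the rule used in the cited studies.

```python
import math


def path_length(points):
    """Total distance traveled along consecutive 3D position samples."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def count_movements(points, dt, speed_threshold):
    """Count bursts in which hand speed rises above speed_threshold.

    The burst-counting rule is an assumption made for illustration.
    """
    count = 0
    moving = False
    for a, b in zip(points, points[1:]):
        is_fast = math.dist(a, b) / dt > speed_threshold
        if is_fast and not moving:
            count += 1  # a new movement starts when speed crosses the threshold
        moving = is_fast
    return count


def performance_time(points, dt):
    """Elapsed time covered by the samples, given a fixed sampling interval."""
    return (len(points) - 1) * dt


# Hypothetical sensor track: move, pause, move (positions in cm, 1 s apart).
track = [(0, 0, 0), (3, 4, 0), (3, 4, 0), (3, 4, 12)]
print(path_length(track), count_movements(track, 1.0, 1.0), performance_time(track, 1.0))
```

Real tracker data would normally be low-pass filtered before counting movements, since sensor noise can otherwise produce spurious threshold crossings.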

Motion  analysis  additionally  has  been  used  in 
robot-assisted  surgery  and  robotic  camera  control  to 
assess  the  efficiency  of  these  technologies.73,79,80,82 
In other studies, assessment of motion analysis system
data has been applied to accuracy of surgical
outcomes  and  to  instrument  tracking.45,62 

Discussion 

New  developments  and  improvements  in  hardware, 
software,  and  research  variables  will  extend  the 
application of motion analysis in surgical ergonomics.
Advancements in computers, digital video
equipment,  and  digital  technology  allow  multiple 
camcorders  to  be  synchronized  during  motion  capture, 
making  possible  3D  motion  analysis.  Although  these 
advanced  systems  are  capable  of  providing  kinematic 
analysis  in  addition  to  digital  imaging,  they  have  not 
yet  been  used  in  surgical  ergonomic  research.  The 
accuracy of research will be greatly improved by technologies
in magnetic field generation and sensor
detection  that  are  being  developed  and  incorporated 
to  overcome  noise  interference  and  signal  attenuation. 

As motion analysis becomes a more common
methodology in surgical ergonomics research, such
systems will be expected to provide more sophisticated
data that will allow more refined, comprehensive
assessment of issues ranging from fatigue to
joint control strategies. Currently, traditional research
variables, such as ROM, which characterizes the
absolute difference between two extremes and is widely
used to explain joint movement range, are too limited
to offer much information. One limitation of ROM,
for instance, is that even a few extreme outliers may
mislead and result in inaccurate interpretation.

Figure 2. For EMG measurement, surface electrodes with
built-in amplifiers are attached to target muscles.

Many  kinematic  variables  used  in  surgical 
ergonomic  studies,  such  as  mean  joint  angle  (MJA) 
or  maximum  angular  velocity  (MaxV),  are  calculated 
over  a  certain  period  of  performance.  Although  these 
variables  have  been  successfully  used  in  surgical 
ergonomic  analysis,  the  single  values  they  provide 
cannot  be  used  to  explain  details  such  as  how  or 
when  joint  angles  change  significantly  or  whether  a 
specific  pattern  of  joint  movement  exists.  For  more 
useful,  comprehensive  characterization  of  surgical 
joint  movements,  more  research  variables  and  analysis 
approaches  will  have  to  be  developed  to  fill  the  gaps 
that traditional variables cannot explain. For example,
Lee et al88 used MJA and mean joint movement
amplitude (MJMA), calculated from MJA and standard
deviation, to study the joint movement range
exhibited by very experienced and less experienced
surgeons performing a pegboard transfer task. In
another study, a pegboard transfer task was partitioned
by subfunctions into several subtasks,
and joint kinematics were investigated within those
individual subtasks.91
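
To make the summary variables discussed above concrete, the following sketch computes ROM, MJA, the standard deviation, and MaxV from a sampled joint-angle series, and shows how a single outlier inflates ROM. The function name and sample values are hypothetical. The exact MJMA formula is not given in the text, so only its stated ingredients (MJA and standard deviation) are reported.

```python
import statistics


def joint_summary(angles_deg, dt):
    """Summarize a joint-angle time series sampled every dt seconds."""
    rom = max(angles_deg) - min(angles_deg)   # range of motion
    mja = statistics.fmean(angles_deg)        # mean joint angle
    sd = statistics.pstdev(angles_deg)        # spread around the MJA
    max_v = max(                              # maximum angular velocity (deg/s)
        abs(b - a) / dt for a, b in zip(angles_deg, angles_deg[1:])
    )
    return {"ROM": rom, "MJA": mja, "SD": sd, "MaxV": max_v}


# Hypothetical elbow flexion samples (degrees) at 10 Hz.
clean = [10, 12, 11, 13, 12]
with_outlier = clean + [45]  # e.g. a single marker-tracking glitch

print(joint_summary(clean, 0.1))
print(joint_summary(with_outlier, 0.1))
```

The second call illustrates the limitation noted in the text: one extreme sample changes ROM from 3 to 35 degrees, while the mean shifts far less.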

As  more  sophisticated,  comprehensive  data  sets 
emerge  from  motion  analysis  studies  to  give  depth 
and  definition  to  research,  revisiting  early  surgical 
ergonomic  study  variables  should  become  more 
commonplace.  For  example,  motion  efficiency  has 
been  defined  by  the  number  of  hand  movements 
used  to  complete  a  task.  The  relationship  between 
motion efficiency and manual dexterity, however,
is proving to be more complex than first thought
(consider, for instance, that the instrument maneuverings
by highly skilled surgeons might well strategically
involve many hand movements) and certainly
merits further consideration.

Electromyography 

As laparoscopic procedures become ever more prevalent,
MIS practitioners are reporting different patterns
and types of fatigue and discomfort. The most common
discomforts reported are located in the arms, neck,
and upper back. EMG is one of the best objective
tools for studying the mechanisms underlying the
muscular discomfort or fatigue reported by laparoscopic
surgeons, as EMG measures and evaluates the
electrical activity of muscles in action and at rest.
Ergonomists  have  long  been  acquainted  with  the 
many  different  ways  EMG  can  be  used  to  quantify 
physical  workload  and  muscular  discomfort.92 

Technology 

There are two types of electrodes used for EMG
measurement: fine wire and surface. Surface EMG
has been shown to be more reliable than fine-wire
EMG for evaluating muscle fatigue and force.93,94

A  fine-wire  electrode  made  of  thin  and  flexible 
metal  is  directly  introduced  into  the  target  muscle  to 
monitor  the  activation  pattern  of  a  single  motor  unit. 
Using  a  fine-wire  electrode,  it  is  possible  to  pick  up 
an  electrical  signal  without  having  attenuation  caused 
by  resistance  of  skin  and  tissues.  The  major  limitations 
of  the  fine-wire  electrode  are  its  inability  to  measure 
the  activity  of  a  muscle  as  a  whole  and  its  invasive, 
often  pain-causing  nature. 

Surface EMG is a simple, noninvasive way to
evaluate the activities of superficial muscles (Figure 2).
Though placement of the electrodes is relatively
quick, painless, and sanitary, care must be taken to
set them so as to minimize the effect of noise signals
coming from more superficial or nearby muscles.
Since skin and underlying tissues between the electrode
and the target muscles serve as electrical
resistance, the surface EMG provides only relative
amplitude information, in contrast to the absolute
values provided by fine-wire EMG. However, surface
electrodes, owing to their considerable advantages,
are dominant in surgical ergonomic studies for
measurement of physical workload and upper body fatigue.

Electromyography Assessment Variables

EMG  data  provide  fundamental  information  about 
amplitude  (how  much)  and  timing  (when)  of  muscle 
activation. Analysis of EMG data allows characterization
of the relationship between muscle activation
and  its  outcomes  (joint  movement,  force,  or  torque), 
as  well  as  estimation  of  muscle  fatigue  level. 

Currently, EMG data analysis is performed using
one of two methods. The more basic measurement
is a simple comparison of average or peak amplitudes
of EMG signals over a certain period of time.95
Such a basic comparison, although useful, cannot be
used to compare activation amplitudes from different
muscle groups used by an individual subject or the
same muscle group used by a number of subjects.
However, percentage of maximum voluntary contraction
(%MVC) can be obtained by calculating the
percentage ratio of measured EMG amplitude to a
reference value, most commonly MVC. These
normalized numbers can then be used for comparisons.
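
The normalization described above is a simple ratio, sketched here. The function name and the amplitude values are hypothetical; they only illustrate why equal raw amplitudes from two muscles need not mean equal effort.

```python
def percent_mvc(emg_amplitude, mvc_amplitude):
    """Express a measured EMG amplitude as a percentage of the MVC reference."""
    if mvc_amplitude <= 0:
        raise ValueError("MVC reference amplitude must be positive")
    return 100.0 * emg_amplitude / mvc_amplitude


# Hypothetical values (mV): the same raw amplitude in two muscles
# with different maximum voluntary contraction amplitudes.
deltoid = percent_mvc(0.12, mvc_amplitude=0.60)
trapezius = percent_mvc(0.12, mvc_amplitude=0.30)
print(f"deltoid {deltoid:.0f} %MVC, trapezius {trapezius:.0f} %MVC")
```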

Frequency analysis is a third method, used to
assess muscle fatigue level, a specific aspect
of muscle activation. Muscle fatigue is perhaps the
most significant ergonomic risk factor for surgeons,
but amplitude and %MVC measure it only
indirectly.96
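
One common way to carry out frequency analysis is to track the median frequency of the EMG power spectrum, which is generally expected to shift downward as a muscle fatigues. The text does not specify which spectral measure the cited work used, so the choice of median frequency here is an assumption, as are the function names and the synthetic test signals. The sketch uses a plain discrete Fourier transform to stay dependency-free; a real analysis would use an FFT library.

```python
import cmath
import math


def power_spectrum(signal, fs):
    """Return (frequencies, power) for the positive-frequency DFT bins."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the DC offset
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            centered[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2)
    return freqs, power


def median_frequency(signal, fs):
    """Frequency that splits the total spectral power into two equal halves."""
    freqs, power = power_spectrum(signal, fs)
    half = sum(power) / 2
    running = 0.0
    for f, p in zip(freqs, power):
        running += p
        if running >= half:
            return f
    return freqs[-1]


# Synthetic stand-ins for EMG: a higher-frequency "fresh" signal and a
# lower-frequency "fatigued" one, sampled at 400 Hz for 0.5 s.
fs = 400
fresh = [math.sin(2 * math.pi * 100 * t / fs) for t in range(200)]
fatigued = [math.sin(2 * math.pi * 50 * t / fs) for t in range(200)]
print(median_frequency(fresh, fs), median_frequency(fatigued, fs))
```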

Electromyography  Data  Analysis 

Amplitude analysis of EMG signals has been used in
several studies in surgical ergonomics.18,31,32,44,97
Berguer and colleagues compared EMG amplitudes to
posit the optimal working angle for a laparoscopic
instrument,18 optimal ergonomic table height,31 and
proper monitor height.32 Uchal et al44 reported that no
muscle activation differences occurred in a comparison
of laparoscopic instrument handles, in-line vs pistol
grip. Maithel et al97 compared head-mounted and
traditional video displays during simulated laparoscopic
procedures and used EMG analysis to determine
which display system caused more muscle fatigue.

Other studies used %MVC analysis.19,21,28,29,41,98,99
Berguer and colleagues have shown that laparoscopic
technique requires more physical effort than open
surgical technique.41,98 They also used the same
analysis to compare different instrument grips.19
Matern and colleagues have compared different
monitor positions99 and instrument handle designs.21
Determination of optimal operating surface height
has been another application.29 Fatigue responses in
several muscle groups have also been studied, with
emphasis on effects of different monitor positions
and levels of surgical experience.28

Some studies have attempted to add other
measurements to %MVC-normalized EMG data. For
example, Quick et al27 have defined relative time of
activation (RAT) as the percentage of time during
which %MVC of each muscle group is 10% or higher.
RATs were then calculated from individual muscle
groups during different tasks, which proved useful to
explain each muscle’s activation timing during a specific
task and to provide information for assessing
muscle-specific overuse. Jonsson has introduced the concept
of amplitude probability distribution function (APDF)
to evaluate the distribution of muscle contractions
over a period of time. This approach identifies the
percentage of time that muscle activity is below a preset
proportion of MVC.100,101 Jonsson also proposed the
10th percentile of these data, commonly known as
the static load level, which has been used in surgical
ergonomics to serve as a threshold in defining
continuous work and minimal-risk muscular load.28
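
The RAT, APDF, and static load level definitions above can be sketched directly from a series of %MVC samples. The function names are hypothetical, the sample data are invented, and the nearest-rank percentile rule is an assumption; the text does not say how the cited authors interpolated percentiles.

```python
import math


def relative_activation_time(pmvc, threshold=10.0):
    """Percentage of samples in which %MVC is at or above the threshold."""
    return 100.0 * sum(1 for v in pmvc if v >= threshold) / len(pmvc)


def apdf(pmvc, level):
    """Percentage of time that activity is below `level` (in %MVC)."""
    return 100.0 * sum(1 for v in pmvc if v < level) / len(pmvc)


def static_load_level(pmvc, percentile=10):
    """%MVC value at the given percentile (nearest-rank rule, an assumption)."""
    ordered = sorted(pmvc)
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical %MVC samples for one muscle during one task.
samples = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
print(relative_activation_time(samples))  # share of time at or above 10 %MVC
print(apdf(samples, level=10))            # share of time below 10 %MVC
print(static_load_level(samples))         # 10th percentile
```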

Additionally, total physical workload as an assessment
of fatigue has been obtained by calculating the
time integral (that is, the area under the signal) of
EMG over a specified period.98 This approach, however,
is appropriate only when each individual task is
given a set time frame; it is not appropriate for
evaluation of tasks unconstrained by time.
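
A minimal sketch of the time integral follows, using the trapezoidal rule on the rectified signal. Rectifying (taking absolute values) before integrating is an assumption made for this example, and the function name and sample values are hypothetical.

```python
def integrated_emg(samples, dt):
    """Area under the rectified EMG signal, by the trapezoidal rule."""
    rectified = [abs(v) for v in samples]
    return sum((a + b) / 2 * dt for a, b in zip(rectified, rectified[1:]))


# The same constant activation level held for 1 s and for 2 s (dt = 0.5 s).
short_task = [1.0, 1.0, 1.0]
long_task = [1.0, 1.0, 1.0, 1.0, 1.0]
print(integrated_emg(short_task, 0.5), integrated_emg(long_task, 0.5))
```

The longer task yields a larger integral at the same activation level, which is why the text restricts this measure to tasks with a set time frame.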

Discussion 

Currently, the majority of EMG studies in surgical
ergonomics investigate muscular workload during
laparoscopic tasks through analysis of signal amplitude
or %MVC. As muscular fatigue is a crucial variable
in ergonomic risk analysis, it requires a more
specific, accurate measurement tool. Frequency
analysis, commonly considered the gold standard
for studying muscular fatigue in biomechanics
and neuroscience research,102,103 has so far been
used in only a few surgical ergonomic studies.49,104
Frequency analysis should be used more intensively
in our field.

Additionally, since the factor of muscle fatigue is
so important, research tools must be developed that
are capable of providing more detailed information
on issues that include identification of
muscular fatigue initiation, quantification of fatigue
progress, and assessment of potential ergonomic
risks of extreme fatigue. Also crucial is the need to
determine risk levels within different muscle groups.
This determination would be accomplished by establishing
and considering those anatomical and geometric
differences, such as type, size, length, strength,
and frequency of use, that exist between and among
muscles. If such detailed data could be generated
and studied, ergonomics research could determine
quantification and normalization through intermuscle
comparisons, thereby acquiring vital knowledge about
the vulnerability of individual muscles.

Figure 3. A surgeon performs a task with each foot on a
single force plate. Data collected from each force plate is used
for postural analysis.

Force  Plate  Systems 

Force  plate — also  called  force  platform — systems  are 
a  staple  for  scientific  measurement  of  body  posture 
within  many  motion  analysis  laboratories,  yet  they  have 
seen  limited  use  in  surgical  ergonomics  (Figure  3). 

Because  the  geometry  of  the  human  body  consists 
of  bigger  and  heavier  masses  at  the  upper  body  with 
supports  at  the  lower  body,  posture  during  standing 
often  has  been  portrayed  as  an  inverted  pendulum.105 
This  is  because  even  though  quiet  standing  appears  to 
be  static,  such  posture  actually  relies  on  dynamics. 
The  CoM  is  located  at  a  short  distance  in  front  of  the 
ankle  joint,  and  therefore  gravity  alone  can  cause  the 
body  to  topple  forward.  Our  bodies  are  effectively 
engineered  to  keep  us  from  falling  face  forward  by 
two  calf  muscles — the  soleus  and  the  gastrocnemius — 
that  actively  compensate  for  gravity’s  effect  and  allow 
us  to  maintain  proper  static  posture.106 

Single or multiple force plate systems have been
used to investigate both sway and balance control
during quiet standing,107,108 perturbed standing,109
and functional standing or walking.110,111

Technology 

A force plate provides valuable information about
ground reaction force (GRF), which is equal in magnitude
and opposite in direction to the force exerted by
the foot of the weight-bearing limb. The force plate
usually has a rigid top plate made of a large piece of
metal or glass. Force sensors at each of its 4 supporting
corners produce an electric output that depends
on the force applied to the upper surface and on the
contact point location.
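
How four corner sensors encode both the applied force and the contact point can be sketched with a simple moment balance. This is an idealized illustration, assuming a rigid plate, a purely vertical load, and sensors that each report vertical force; the function name, plate dimensions, and readings are hypothetical, and commercial plates apply their own calibration.

```python
def center_of_pressure(corner_forces, corner_positions):
    """Return (total vertical force, (x, y) contact point) for a rigid plate.

    corner_forces: vertical force at each corner sensor (N).
    corner_positions: (x, y) location of each sensor on the plate (m).
    The contact point is the force-weighted mean of the sensor positions.
    """
    total = sum(corner_forces)
    if total <= 0:
        raise ValueError("plate is unloaded")
    x = sum(f * p[0] for f, p in zip(corner_forces, corner_positions)) / total
    y = sum(f * p[1] for f, p in zip(corner_forces, corner_positions)) / total
    return total, (x, y)


# Hypothetical 0.4 m x 0.6 m plate with sensors at its corners.
corners = [(0.0, 0.0), (0.4, 0.0), (0.4, 0.6), (0.0, 0.6)]
centered_load = [200.0, 200.0, 200.0, 200.0]  # load over the plate center
shifted_load = [100.0, 300.0, 300.0, 100.0]   # load shifted toward x = 0.4

print(center_of_pressure(centered_load, corners))
print(center_of_pressure(shifted_load, corners))
```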

Force sensors are either piezoelectric or strain
gauge. Piezoelectric systems use quartz transducers
to produce an electrical charge that is proportional to
the applied force. These systems require special cabling
and charge amplifiers and generally have greater
sensitivity and force measurement range than strain-gauge
systems. However, piezoelectric systems may not be
suitable for studies that involve prolonged standing,
since the electrical charge attenuates over time. A
strain-gauge system needs no special cables or amplifiers,
and its electric signal does not decrease over time.

A force platform provides the 3 components
(vertical, lateral, and fore-aft) of force,
the two coordinates of the center of pressure (CoP),
and the rotational forces (moments) about the x, y,
and z axes. The location of the CoP on a 2D surface
has been widely used in postural analysis.

Force  Plate  Data 

The importance of good posture for surgeons, as
well as for other medical staff, is well
documented.112,113 Upper body postures, including head,
arm, and trunk movements, have been analyzed in
several studies31-33,56,59,72,84 in attempts to define what
constitutes optimal surgical posture, yet few studies
in surgical ergonomics have used balancing
information available from force plate systems alone.

Berguer et al114 used force plate data to compare
surgeons’ postures during laparoscopic and open
surgical procedures. Using a single force plate system and
analyzing the locations and ROM of CoP in two directions
(anterior-posterior and medial-lateral), the study
concluded that during laparoscopic procedures, surgical
posture was less dynamic, as was shown by significantly
reduced ROM of CoP. Using primarily FLS
tasks, Park and colleagues have studied changes in
surgeons’ postures during task performance.72,87,89,115
In the first study, they found that when CoP excursions
were defined by outer boundaries, a significant
increase in movements could be correlated to task
difficulty.87 Another study analyzed the postural control
strategies used by surgeons of differing experience levels
in the performance of 3 different FLS tasks. The
more experienced surgeons used unique postural controls
during each task, and their strategies differed
from those used by less experienced surgeons.89 To
demonstrate that postural instability does not necessarily
correlate to poor performance, another study
used CoM as a measurement. A highly experienced
surgeon with a wrist complication made compensatory
arm movements to minimize wrist flexion and exhibited
strategic movements that, although they appeared
to signal postural instability, actually proved necessary
to achieve successful task performance.72 For a more
systematic assessment, Lee and Park used principal
component analysis (PCA) to construct an
ellipse that covered 95% of CoP excursions in order to
characterize sway areas, directions, and shape.115 In that
same study, a new variable was introduced and defined:
postural stability demand (PSD), a calculation of the
absolute distance between CoP and CoM. Through
this rarely used measurement, it was discovered that
during performance of each of 3 FLS tasks, less
experienced surgeons exhibited high PSD, which implied
higher postural instability.
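
The CoP measures described in this section (ROM in two directions, a PCA-based 95% ellipse, and PSD) can be sketched as follows. Several choices here are assumptions for illustration rather than the cited authors' procedure: the ellipse is scaled with the 95% chi-square quantile for 2 degrees of freedom, which presumes roughly bivariate-normal sway; PSD is taken as the time-averaged horizontal distance between CoP and the ground projection of CoM; and all names and data are hypothetical.

```python
import math

CHI2_95_2DOF = 5.991  # 95% quantile of the chi-square distribution, 2 dof


def cop_rom(cop):
    """ROM of CoP along each axis, e.g. medial-lateral and anterior-posterior."""
    xs = [p[0] for p in cop]
    ys = [p[1] for p in cop]
    return max(xs) - min(xs), max(ys) - min(ys)


def sway_ellipse(cop):
    """Principal axes of the CoP cloud, scaled to a 95% ellipse."""
    n = len(cop)
    mx = sum(p[0] for p in cop) / n
    my = sum(p[1] for p in cop) / n
    sxx = sum((p[0] - mx) ** 2 for p in cop) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in cop) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in cop) / (n - 1)
    # Eigenvalues of the 2x2 covariance matrix are the principal variances.
    half_trace = (sxx + syy) / 2
    disc = math.sqrt(max(half_trace ** 2 - (sxx * syy - sxy ** 2), 0.0))
    var_major, var_minor = half_trace + disc, max(half_trace - disc, 0.0)
    semi_major = math.sqrt(CHI2_95_2DOF * var_major)
    semi_minor = math.sqrt(CHI2_95_2DOF * var_minor)
    return {
        "area": math.pi * semi_major * semi_minor,
        "direction_rad": 0.5 * math.atan2(2 * sxy, sxx - syy),  # major axis
        "axis_ratio": semi_minor / semi_major if semi_major else 1.0,
    }


def postural_stability_demand(cop, com_xy):
    """Mean horizontal distance between CoP and projected CoM samples."""
    return sum(math.dist(a, b) for a, b in zip(cop, com_xy)) / len(cop)


# Hypothetical CoP samples (cm) and matching CoM ground projections.
cop = [(1.0, 0.0), (-1.0, 0.0), (0.0, 0.5), (0.0, -0.5), (0.0, 0.0)]
com = [(0.0, 0.0)] * len(cop)
print(cop_rom(cop))
print(sway_ellipse(cop))
print(postural_stability_demand(cop, com))
```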

Discussion 

Maintaining  good  posture  is  absolutely  necessary 
for  top  surgical  performance.  The  balancing  activity 
measured  by  force  plate  systems  should  be  regarded 
as  a  necessity  for  postural  analysis  undertaken  in 
surgical  ergonomics. 

Alone,  the  data  obtained  from  force  plates  are 
useful.  Ultimately,  however,  such  data,  if  combined 
with  motion  analysis  data,  will  allow  calculations  by 
which  more  detailed  mechanical  assessments  may 
be  acquired.  Then,  by  applying  biomechanical  models 
(mathematical  and/or  analytical  algorithms)  to  that 
more robust description, crucial information regarding
the underlying etiology of the extreme muscular
fatigue  and  physical  complications  often  experienced 
by  laparoscopic  surgeons  may  be  investigated. 

Conclusions 

We have intensively reviewed what we have identified
as the primary research components (that is, tasks,
models, and measurement systems) that form the
essential methodology of MIS ergonomics as this
discipline moves through its initial maturation stages.
In doing so, we have addressed elements including,
but not limited to, background, technologies, applications,
drawbacks, and advantages. In closing, we
discuss limitations and possibilities we see inherent
in these primary research components and proffer
suggestions that, if put into practice, we believe will
only augment the credible and useful findings already
comprising surgical ergonomic research.

The quality and quantity of data collected are
significantly dependent on tasks, models, and assessment
systems. The research that has governed surgical
ergonomics thus far has primarily addressed specifics
through attempts to seek and obtain solutions that
might be quickly applied to minimize the immediate
effects of immediate problems. Current data collection,
although useful, is limited in terms of what it
can contribute to the formation of standard matrices.

For true effectiveness and validity, surgical
ergonomic studies must expand in the near future to
include more investigation and compilation of physical
behavior characteristically engaged in by expert
surgeons, including joint movement and postural
control. Such research will promote extrapolation
and description of characteristic patterns. Knowledge
of such patterns is integral to the formation of standard
matrices and will help to define what constitutes
efficient, effective, and ergonomically safe physical
behavior. Additionally, the creation and application
of more objective training protocols promise to be
important outcomes served by the development of
such matrices.

Typically, in a discipline’s infancy, its studies
address simple problems through simple constructs.
For instance, it now appears that the range of acceptable
movements or strategies employed by surgeons
during task and procedure performance is far larger
and more complex than envisioned in the early days
of surgical ergonomic research. As a specific illustration,
early studies seemed to adequately explain that
using fewer movements to complete a task
(or a smaller amount of overall motion in task
execution) was indicative of surgical maneuvering
at its most effective and most efficient.

As our field becomes more established and
sophisticated, however, it becomes more difficult to
accept traditional concepts such as “less or fewer is
best.” Discipline maturation requires that explanations
that gave meaning as research was conducted initially
be subjected to scrutiny. Thus, the sanction accorded
the concept of less or fewer movements is necessarily
challenged by newer surgical ergonomic research
that details the unique patterns of movements (sometimes
many and sometimes few) creditable to each
surgeon trying to successfully and competently achieve
surgical goals while maintaining personal body comfort
and stability.

Extended physical issues could well produce long-term
effects resulting in surgeons abandoning what
has long constituted their normal movement and
replacing that with alternative compensatory movements
in an effort to restore the originally enjoyed
level of surgical movement. No matter how well-intentioned
such compensatory movements might be,
they may unintentionally cause secondary problems.

Secondary  muscle  strategies,  for  instance,  employed 
as  a  result  of  compensatory  movements,  constitute 
unique  patterns  whose  explorations  should  be 
undertaken  in  future  surgical  ergonomic  studies.  As 
a  specific  example,  we  offer  that  one  muscle  group 
experiencing  extended  fatigue  might  be  expected  to 
give  over  to  another  muscle  group.  The  secondary 
group  of  muscles,  taking  over  for  the  primary  group, 
is  not  likely  to  have  been  trained,  a  situation  that 
could  well  incur  loss  of  performance  accuracy. 

Surgical  ergonomics  research  is  now  addressing 
issues  and  seeking  solutions  that  will  result  in  levels 
of  data  acquisition  that  are  still  rare.  Though  our 
review  indicates  a  paucity  of  data  acquired  within 
the  OR,  indications  are  that  the  challenges  of  safely 
and  unobtrusively  conducting  ergonomic  research  in 
that  environment  will  increasingly  be  undertaken. 
As VR models become more realistic and comprehensive,
simulation in the laboratory environment
will yield more sophisticated information. A broadening
and deepening of the laparoscopic tasks studied
(current research is too dependent on suturing)
should also occur.

Measurements  as  well  as  assessment  systems, 
often used across disciplines and in multiple combinations,
but now only marginally considered within
our  discipline  (as  we  have  indicated  has  been  the 
case  with  force  plate/force  platform  systems),  will 
with  more  prevalent  use  permit  us  to  gather  more 
detailed,  robust  data.  And  there  is  no  doubt  that 
the  future  surely  holds  in  store  for  us  the  formation 
of  large,  cross-disciplinary  databases  in  which  the 
growing  amount  of  surgical  ergonomic  research  will 
be  stored  and  referenced. 


The ideal result of these efforts would be comprehensive
characterization of surgical movement in
relation to task or procedure. This result would then
allow exact movements (with, presumably, their
associated excellent clinical outcomes) to be patterned
and then to be incorporated into the teaching
of novice surgeons. Furthermore, and perhaps more
urgently, such knowledge would facilitate better
design not only of surgical instruments but also of
the surgical workspace, which would lessen or eliminate
the ergonomic ravages of MIS currently associated
with laparoscopic surgical performance.

Abstract

Background Given the physical risks associated with
performing laparoscopic surgery, ergonomics to date has
focused on the primary minimally invasive surgeon. Similar
studies have not extended to other operating room staff.
Simulation of the assistant’s role as camera holder and
retractor during a Nissen fundoplication allowed investigation
of the ergonomic risks involved in these tasks.
Methods Seven subjects performed camera navigation
and retraction tasks using a box trainer on an operating
room table that simulated an adult patient in low lithotomy
position. Each subject stood on force plates at the simulated
patient’s left side. A laparoscope was introduced
through a port into the training box with four 2-cm circles
as rear-panel targets located in relation to the assistant as
distal superior, proximal superior, distal inferior, and
proximal inferior (target effect). The subjects held the
camera with their left hand, pointing it at a target. The task
was to match the target to a circle overlaid on the monitor.
Simultaneously, a grasper in the right hand grasped and
pulled a panel-attached band. A signal after 1 min moved the
subject to the next target. Each trial had three four-target
repetitions (phase effect). The subjects performed two
separate trials: one while holding the camera from the top
and one while holding it from the bottom (grip effect). A
4 × 3 × 2 (target × phase × grip) repeated-measures
design was used for the statistical analysis. Dividing the left
force-plate vertical ground reaction forces (VGRF) by the total
VGRF from both plates provided a weight-loading ratio (WLR).
Results The WLR significantly increased (p < 0.005)
with proximal targets (to 80% at target 2 and 79% at target 4).
The WLR decreased over time (75%, 74%, and 71% across the
early, middle, and late phases). No difference
existed between the grip strategies (grip effect, p > 0.5).
Conclusions A high-risk ergonomic situation is created by
the assistant’s left or caudal leg disproportionately bearing
70-80% of body weight over time. A distance increase
between the camera head location and the camera holder
increases ergonomic risk. The phase effect was interpreted
as a compensatory rebalancing to reduce ergonomic risk.
Ergonomic solutions minimizing ergonomic risks associated
with laparoscopic assistance should be considered.

Keywords  Camera  assistant  •  Ergonomics  •  Force  plate  • 
Laparoscopic  assistance  •  Postural  analysis  •  Simulation 

It  is  well  known  that  maintaining  correct  posture  is  a  very 
important  ergonomic  factor  in  minimizing  physical  risks 
associated with the performance of complex tasks [1-3].
Awkward working posture can cause increased stress for
certain  body  parts,  resulting  in  fatigue,  musculoskeletal 
disorders,  and  nerve  problems  [4]. 

To  maintain  proper  balance,  the  human  body  requires 
continuous  active  control.  Because  two-thirds  of  body  mass 
ordinarily  is  in  the  top  two-thirds  of  a  person’s  height 
above  ground  level,  the  body  has  been  described  as  an 
unstable  balance  system,  and  the  standing  posture  often  is 
referred  to  as  an  inverted  pendulum  [5].  For  this  reason, 
posture  during  quiet  standing  actually  relies  on  dynamic 
control  although  the  posture  may  appear  to  be  static  [6]. 

Many  everyday  tasks  consist  primarily  of  static  posture 
while  the  upper  body  performs  more  dynamic  motions  (e.g. 
cashiers  at  registers,  secretaries  at  computers,  bus  drivers). 
Ergonomic evaluations and assessments have been undertaken
in various industrial workplaces to address the levels
of  postural  stress/discomfort  quantitatively  for  descriptions 
of  optimal  work  postures  [1,  7].  Increasing  numbers  of 
posture  studies  involving  health  care  workers  have  been 
conducted,  but  their  focus  has  been  primarily  on  low  back 
pain  problems  experienced  by  hospital  nurses. 

Ergonomic  stress  during  dynamic  tasks  such  as  patient 
lifting  and  transferring  is  only  part  of  a  broader  picture. 
Prolonged  static  posture  has  been  associated  also  with 
ergonomic  stress  [8].  The  physical  stress  associated  with 
the  fixed  work  posture  of  many  surgeons  and  operating 
room staff can result in discomfort, fatigue, and
musculoskeletal disorders.

Performing a laparoscopic surgical procedure places
particularly high physical and cognitive demands on surgeons
that differ substantively and dramatically from those
experienced during open surgery. Knowledge of ergonomics
related to primary laparoscopic surgeons has been well
described in previous studies. Several minimally invasive
surgery (MIS) components including long-shaft instruments,
access ports, and endoscopic image display systems
have been identified as contributing to ergonomically
unfavorable postures assumed and maintained by laparoscopic
surgeons during procedural performances [9-12].

Motion  analysis  and  force  plates  have  been  among  the 
tools used to examine surgeons’ body movements, specifically
the measurement and analysis of postural sway [13].
Upper  body  movements  have  been  characterized  by  Person 
et  al.  [14],  who  demonstrated  the  feasibility  of  using  an 
optoelectric  measurement  system  for  automated  posture 
sampling  in  the  study  of  surgical  ergonomics. 

Berguer  et  al.  [15]  compared  surgeons’  postures  during 
laparoscopic  and  open  surgical  procedures  by  analyzing  the 
locations  and  range  of  motion  (ROM)  of  the  center  of 
pressure  (COP)  in  two  directions  (anteroposterior  and 
mediolateral).  They  concluded  that  during  laparoscopic 
procedures,  surgical  posture  is  less  dynamic,  as  shown  by 
significantly reduced ROM of COP.

Gillette et al. [16] calculated the outer boundaries to
quantify the amount of sway in COP excursions. They
found that significant increases in movement could be
correlated with task difficulty. Additionally, postural control
strategies used by surgeons of differing experience
levels have been analyzed during fundamentals of laparoscopic
surgery task performance. These studies concluded
that each task required unique postural control mechanisms
and that a significant difference in sway control was evident
among surgeons with different surgical skill levels [17, 18].

For  a  more  systematic  assessment  of  postural  control, 
Lee  and  Park  [19]  characterized  sway  areas,  directions,  and 
shape  by  constructing  an  ellipse  using  principal  component 
analysis  that  covered  95%  of  COP  excursions.  In  that  same 
study,  postural  stability  demand  (PSD)  was  calculated  as 
the  absolute  distance  between  COP  and  the  center  of  mass. 
This  analysis  showed  that  less  experienced  surgeons 
exhibited high PSD, which implied higher postural instability
during performance of laparoscopic surgery task
fundamentals. 
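
The 95% sway ellipse described above can be illustrated with a short sketch. This is not the method of Lee and Park [19] as implemented; it is a minimal example that assumes the COP excursions are roughly bivariate normal, so that the principal axes of the sample covariance matrix, scaled by the 95% chi-square quantile for two degrees of freedom, enclose about 95% of the samples. The function names and the synthetic COP trace are hypothetical.

```python
import math
import random


def cop_ellipse(xs, ys, coverage=0.95):
    """Fit a PCA-based sway ellipse to COP samples (assumes ~bivariate normal data)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Closed-form eigenvalues of the 2x2 covariance matrix (principal variances).
    half_trace = (sxx + syy) / 2.0
    root = math.sqrt(max(half_trace ** 2 - (sxx * syy - sxy ** 2), 0.0))
    lam_major, lam_minor = half_trace + root, half_trace - root
    # Chi-square quantile with 2 degrees of freedom: -2 ln(1 - coverage).
    scale = -2.0 * math.log(1.0 - coverage)
    a, b = math.sqrt(scale * lam_major), math.sqrt(scale * lam_minor)
    return {
        "center": (mx, my),
        "semi_axes": (a, b),
        "angle": 0.5 * math.atan2(2.0 * sxy, sxx - syy),  # major-axis direction, rad
        "area": math.pi * a * b,
        "cov": (sxx, syy, sxy),
        "scale": scale,
    }


def fraction_inside(xs, ys, ellipse):
    """Share of samples whose squared Mahalanobis distance lies within the ellipse."""
    mx, my = ellipse["center"]
    sxx, syy, sxy = ellipse["cov"]
    det = sxx * syy - sxy ** 2
    inside = 0
    for x, y in zip(xs, ys):
        dx, dy = x - mx, y - my
        d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        inside += d2 <= ellipse["scale"]
    return inside / len(xs)


# Synthetic COP trace (mm), purely illustrative: mediolateral and anteroposterior.
rng = random.Random(0)
ml = [rng.gauss(0.0, 4.0) for _ in range(5000)]
ap = [0.5 * m + rng.gauss(0.0, 8.0) for m in ml]
ellipse = cop_ellipse(ml, ap)
covered = fraction_inside(ml, ap, ellipse)
```

The ellipse area summarizes the amount of sway, the axis angle its dominant direction, and the ratio of the semi-axes its shape. Real COP data are often not normally distributed, so the actual coverage should be checked rather than assumed.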

Another  postural  study  using  center  of  mass  sway 
analysis  demonstrated  that  postural  instability  does  not 
necessarily  correlate  with  poor  performance  [20].  A  highly 
experienced surgeon with a wrist condition made compensatory
arm movements to minimize wrist flexion. This
surgeon also exhibited strategic movements that, although
appearing to signal postural instability, actually proved to
be necessary for achieving successful task performance.

The  ergonomics  associated  with  the  operating  surgeon’s 
performance are understandably a priority. Although an
active part of the entire operation, the laparoscopic assistant
has not, until now, been the subject of
ergonomics studies.

Nissen  fundoplication,  a  common  procedure,  involves 
both advanced and basic laparoscopic skills. The inexpensive,
easily constructed box trainer we designed
provides  a  simulation  alternative  to  the  animal  models  that 
to  date  have  dominated  studies  of  minimally  invasive 
fundoplication  task  performance. 

While  performing  a  fundoplication,  the  surgeon  stands 
centered  between  the  patient’s  legs  in  alignment  with  the 
diaphragmatic  hiatus.  This  position  keeps  the  surgeon 
facing  straight  ahead  in  as  neutral  a  posture  as  possible  and 
one  assumed  to  be  ergonomically  favorable.  In  contrast,  the 
camera  holder  is  positioned  to  accommodate  the  surgeon, 
which  usually  means  standing  to  one  side  of  the  patient  and 
thus  not  in  optimal  alignment  with  the  working  area.  To 
accommodate,  the  camera  holder  is  forced  to  rotate  his  or 
her  upper  body  while  simultaneously  reaching  across  the 
operative  field  to  hold  the  camera  or  retract  tissues  for  the 
surgeon. 

This  study  aimed  to  quantify  ergonomic  risks  associated 
with the tasks performed during a fundoplication by the
camera assistant. The particular segment of the study
reported  involves  postural  balancing.  We  hypothesized  that 
postural  balancing  is  affected  by  the  grip  used  to  hold  the 
camera  and  the  location  and  relocation  of  the  camera  head 
in  relation  to  different  targets.  We  also  investigated  how 
fatigue  influences  postural  balancing  over  time-framed 
(early,  middle,  late)  phases  of  task  performance. 


Materials  and  methods 

This institutional review board-approved study was performed
in the Surgical Ergonomics Laboratory at the
Maryland Advanced Simulation, Training, Research, and
Innovation (MASTRI) Center at the University of Maryland.
Seven right-handed subjects possessing different
levels of MIS experience ranging from medical student to
fellow from the Department of Surgery at the University of
Maryland School of Medicine volunteered, signed an
informed consent, and agreed to perform camera navigating
and retracting tasks.

An  adult  patient  in  low  lithotomy  position  was  simulated 
by  a  training  box  on  an  operating  room  table.  An  extended 
arm  board  was  attached  to  the  table  bearing  the  trainer  box 
as  a  representation  of  the  obstruction  that  would  be  caused 
by the patient’s left leg. Additionally, to simulate the situation
of the assistant working around the arm of the
operating  surgeon,  we  inserted  two  graspers  through  ports 
set  on  the  right  and  left  sides  of  the  camera  port  and 
instructed  each  subject  to  work  around  these  graspers.  At 
the  left  side  of  the  simulated  patient,  each  subject  stood  on 
two  force  plates  (Advanced  Mechanical  Technology  Inc., 
Watertown,  MA,  USA),  one  leg  on  each  (Figs.  1  and  2). 



Fig. 1 Experimental setup for the simulation of the camera
assistant’s roles as a camera holder and retractor during a Nissen
fundoplication.  A  trainer  box  placed  on  the  operating  room  table 
simulates  a  patient  in  low  lithotomy  position.  A  board  with  targets, 
circles,  and  bands  is  attached  to  the  side  of  the  trainer  box  at  the 
location  of  the  simulated  patient’s  head.  While  performing  the  tasks, 
the  subject  stands  on  two  force  plates 


A 0° scope displayed endoscope images on a standard
liquid crystal display (LCD) monitor positioned at eye
level at the head of the bed. A laparoscope was introduced
into the training box, which also contained four 2-cm
circles functioning as targets placed on the rear panel in
the  following  relations  to  the  assistant:  distal  superior, 
proximal  superior,  distal  inferior,  and  proximal  inferior 
(target  effect).  Four  additional  2-cm  circles  approximately 
10  cm  apart  were  marked  in  a  straight  line  between  the 
superior  and  inferior  targets.  A  rubber  band  was  stapled  to 
each  of  the  outermost  circles  (A,  C)  for  the  purpose  of 
retraction  to  the  inner  circles  (B,  D)  (Fig.  3). 

Each  subject  performed  a  12-min  trial  that  consisted  of 
three  4-min  phases  (early-,  middle-,  and  late-phase  effect). 
In each phase of four dual-task modules, the subject performed
camera-holding and -pointing tasks (moving from
target  1  through  target  4)  together  with  grasping  and 
retraction  tasks  (moving  from  A  to  B,  C  to  D,  A  to  B,  and  C 
to  D).  Each  subject  performed  two  trials,  one  while  holding 
the  camera  from  the  top  and  one  while  holding  it  from  the 
bottom  (grip  effect). 

Specifically,  the  camera-holding  and  -pointing  task 
required  holding  the  camera  with  the  left  hand  and  pointing 
it  at  a  target  (Fig.  4).  On  the  screen  were  two  circles 
(diameters  of  4.5  cm  for  the  larger  and  2.5  cm  for  the 
smaller)  printed  on  a  transparency  attached  to  the  monitor. 
The task for the subject was to meet the accuracy
constraint by keeping the target confined between the
boundaries of the two circles. The grasping and retraction task
required  holding  a  grasper  in  the  right  hand  to  grasp  a 
rubber  band  at  one  location  and  retract  it  to  another.  Both 
tasks  were  always  executed  simultaneously. 

Once both retraction and camera pointing were completed,
the subject was asked to maintain accuracy
constraints  of  both  the  grasping  and  camera-pointing  tasks 
for  a  minute.  While  doing  so,  each  subject  was  free  to 
change  or  adjust  his  or  her  posture. 

The  amplitudes  of  the  vertical  ground  reaction  forces 
(VGRF)  exerted  by  both  the  left  and  right  legs  onto  each 
force  plate  were  collected  and  recorded  at  200  Hz.  While 
each  subject  maintained  static  posture  for  each  of  the  four 
tasks within each of the three phases, data permitting calculation
of a weight-loading ratio (WLR) were obtained.
To  quantify  the  balancing  taking  place  between  the  two 
legs,  we  derived  the  WLR  by  dividing  the  left  force  plate 
VGRF  by  the  total  VGRF  from  both  plates. 


$$\mathrm{WLR} = \frac{\text{Left VGRF}}{\text{Left VGRF} + \text{Right VGRF}} \times 100$$
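
As a minimal illustration of the WLR definition, the sketch below computes the ratio from paired left and right VGRF samples. The function name, the choice to average per-sample ratios over the hold window, and the force values are assumptions made for the example and are not taken from the study's processing pipeline.

```python
from statistics import mean


def weight_loading_ratio(left_vgrf, right_vgrf):
    """Mean WLR (%) over paired left/right vertical ground reaction forces (N)."""
    if len(left_vgrf) != len(right_vgrf) or not left_vgrf:
        raise ValueError("need equally long, non-empty force traces")
    ratios = []
    for left, right in zip(left_vgrf, right_vgrf):
        total = left + right
        if total <= 0:
            continue  # skip samples with no load on either plate
        ratios.append(100.0 * left / total)
    if not ratios:
        raise ValueError("no loaded samples in the window")
    return mean(ratios)


# Hypothetical 1-s window sampled at 200 Hz: 525 N on the left plate, 175 N on the right.
left = [525.0] * 200
right = [175.0] * 200
wlr = weight_loading_ratio(left, right)  # 75.0, i.e., 75% of body weight on the left leg
```

A WLR of 50 corresponds to evenly distributed weight; values above 50 indicate that the left leg carries the larger share.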


Thirty-nine 9.5-mm retroreflective sphere markers were
attached to each participant according to Plug-in-Gait
(ViconPeak, Lake Forest, CA, USA) marker placement
designations. The markers were placed on bands around the
head, wrists, and knees; a custom-made vest and waist belt;
bare arms; and the shoulder, thighs, and shanks of the
participants’ medical scrubs. Placement of elastic bands and
a vest over the scrub suit of each participant permitted the
markers to be attached securely to anatomic landmarks. A
motion capture system (ViconPeak, Lake Forest, CA, USA)
consisting of 12 high-speed, high-resolution, infrared
digital cameras tracked the markers and reconstructed
body segment movement in three-dimensional (3D) space.
The location data of each marker were sampled at 100 Hz.

Fig. 2 Each subject while standing on two force plates
performs the camera-pointing task with the left hand and the
retraction task with a grasper in the right hand. Reflective
markers and electromyographic electrodes are attached to the
subject for motion and muscle activation analysis

Fig. 3 Inside view of the trainer box showing the target board. Four
circles numbered 1 to 4 are the targets for the camera-pointing task.
Another four circles are labeled A to D. Circles A and D have rubber
bands attached for the tissue retraction task

Fig. 4 Screen shot of the monitor showing a target confined between
the two circles on the monitor

Statistical  analysis 

An overall 4 × 3 × 2 (target × phase × grip) analysis of
variance (ANOVA) with repeated measures was applied to
characterize the extent to which each laparoscopic assistant’s
body weight was distributed between the legs. Then the
main effects of these three factors (target, phase, grip) and
their interactions were analyzed. The significance level was
set at a p value of 0.05.
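
To make the repeated-measures idea concrete, the sketch below computes the F statistic for a single within-subject factor. It is a deliberate simplification of the 4 × 3 × 2 factorial design used in the study: only one factor is tested, the subject-by-condition interaction serves as the error term, and no p value is derived because that requires the F distribution, which the Python standard library does not provide. The function name and the WLR values are hypothetical.

```python
def rm_anova_oneway(data):
    """One-way repeated-measures ANOVA.

    data[s][c] is the score of subject s under condition c.
    Returns (F, df_condition, df_error).
    """
    n = len(data)     # subjects
    k = len(data[0])  # conditions
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[c] for row in data) / n for c in range(k)]
    subj_means = [sum(row) / k for row in data]

    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    # Between-subject variability is removed before forming the error term.
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    f_stat = (ss_cond / df_cond) / (ss_error / df_error)
    return f_stat, df_cond, df_error


# Hypothetical WLR values (%) for three subjects under three conditions.
wlr_by_subject = [
    [71.0, 72.0, 74.0],
    [72.0, 73.0, 74.0],
    [73.0, 74.0, 76.0],
]
f_stat, df_cond, df_error = rm_anova_oneway(wlr_by_subject)
```

Because each subject serves as his or her own control, differences in overall stance between subjects do not inflate the error term, which is the main reason a repeated-measures design suits a sample of seven subjects.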

Results 

The data on the two grip strategies collected separately
showed no statistical difference between them in terms of
weight balancing (p > 0.5). Therefore, the data were
consolidated for further analysis using 4 × 3
(target × phase) repeated-measures ANOVA. The significant
main result was that the WLR was found to be different
among the four targets (p < 0.005), as shown in Fig. 5.
Post hoc tests further showed that the WLR significantly
increased with proximal targets (i.e., 2, 4) by 11.8%
compared with distal targets (i.e., 1, 3). No difference was
found between targets on the same side (inferior, 3 and 4;
superior, 1 and 2). The findings also showed that WLR
decreased as the four target camera navigation modules
were repeated (phase effect, p < 0.05) (Fig. 6). The interaction
effect between the target and phase factors was not
significant (p > 0.05).

Fig. 5 Target effect as it relates to the weight-loading ratio (WLR),
which significantly increased with proximal targets (i.e., 2, 4) by
11.8% compared with distal targets (i.e., 1, 3). No difference was
found between targets on the same side (i.e., inferior, 3 and 4;
superior, 1 and 2)

Fig. 6 Phase effect on weight-loading ratio (WLR), which decreased
as each four-target phase of task performance was repeated

Discussion 

This  study  investigated  the  ergonomic  risks  potentially 
experienced  by  laparoscopic  assistants  during  a  simulated 
laparoscopic Nissen fundoplication. Specifically, it analyzed
the postural balancing that occurs as camera
navigation  and  target  retraction  tasks  are  performed. 


Postural  balancing  analysis,  achieved  by  what  we  termed 
WLR,  using  a  force-plate  system,  demonstrated  that  the 
assistant’s  left  leg  disproportionately  bore  70%  to  80%  of 
body  weight  over  time,  thus  creating  a  high-risk  ergonomic 
situation.  It  can  reasonably  be  assumed  that  if  the  camera 
holder  stood  on  the  patient’s  right  side,  the  data  would  be  a 
mirror  image  of  that  presented  in  this  discussion  (i.e.,  more 
load  on  the  right  leg). 

Additionally,  after  introduction  of  the  camera  through 
the  one  access  port  used  in  our  study,  we  discovered  an 
ergonomic  risk  traceable  to  the  fulcrum  effect.  Specifically, 
when  the  camera  was  pointed  toward  the  proximal  target, 
the  camera  head  actually  moved  toward  the  distal  location. 
This is referred to as the target effect. The risk presented in
this situation is that the camera assistant must maintain
continuous  extension  of  the  left  arm  while  simultaneously 
leaning  the  entire  body  left. 

When we considered the fatigue effect on postural balancing
over phases, we found that WLR decreased as a
phase  was  repeated  (early,  middle,  late).  This  is  referred  to 
as  the  phase  effect.  We  interpreted  this  WLR  result  as 
indicative  of  the  camera  assistant  acting  in  a  compensatory 
manner  to  combat  the  increased  muscular  fatigue  that 
developed  over  time.  Considering  that  this  compensatory 
action  resulted  in  a  reduction  of  ergonomic  stress  at  the 
joint  of  the  left  leg,  we  propose  ergonomic  solutions  such 
as camera-handle attachments to minimize the time-accrued
effects of the extended arm and unbalanced leaning
posture. Another solution to minimize fatigue and
maximize postural stability may be to train the camera
assistant in a simulated situation, as done in our study,
before actual operating room performance of a fundoplication,
with specific instructions given on the need to
rebalance weight as accurate camera pointing and retraction
are repeatedly achieved.

To mitigate obstructions that potentially create positioning
issues, compensatory action, such as equipping the
operating table with stirrups or having the assistant work
around the arms of the operating surgeon, also is probable. In
this study, our attention to such obstructions was minimal.

Ergonomic  studies  in  laparoscopic  surgery  have  focused 
on  understanding  and  improving  the  ergonomic  risks  and 
benefits  involved  in  the  primary  surgeon’s  performances 
[21].  In  a  recent  literature  review,  these  evidence-based 
experimental  ergonomic  studies  were  discussed  in  detail, 
including examination of their methodological infrastructures
(e.g., tasks, models, measurement systems) [13].
Additionally,  the  obtained  data  and  specific  applications 
covered  in  these  ergonomic  studies  were  summarized.  In 
brief,  the  review  investigated  studies  on  the  effects  that 
operating room components such as display systems [22-24],
instrument handle designs [25-28], operating table
heights [29, 30], and instrument, scope, and task
locations/orientations [31-33] have on task performance in addition
to  the  physical  workload  of  primary  surgeons. 

The  studies  reviewed  also  quantitatively  assessed  levels 
of  primary  laparoscopic  surgeons’  skills,  dexterity,  and 
motion  efficiency  [27,  34,  35].  The  majority  of  models  and 
tasks used in the reviewed studies were designed to simulate
the instrument maneuvering exercised and the surgical
tasks  performed  by  primary  surgeons  [29,  30,  36-39]. 

This  review  examining  the  current  body  of  ergonomic 
research  related  to  laparoscopic  surgery  showed  the  need  to 
extend  that  analysis  to  support  staff  who  have  an  active  role 
in  operating  room  procedures.  The  few  existing  studies  on 
the  task  performance  of  camera  assistants  had  goals  quite 
different  from  ergonomic  risk  investigation.  One  study 
investigated  the  application  of  an  electromagnetic  motion 
tracking system for simple measurements of the movements
of the camera and holding arm [40]. Another study
compared  motion  efficiencies  measured  from  human 
camera  drivers  and  robot-assisted  camera  control  [41]. 

In  a  series  of  studies  focused  on  products  rather  than 
task  performance,  Van  Veelen  et  al.  [24]  investigated  the 
relation  of  display  locations  to  assistants’  neck  movements 
and  muscle  activation  during  camera  holding,  although  not 
comparatively  [29].  In  an  observational  survey  study,  Van 
Veelen  made  the  statement  that  positioning  of  the  camera 
may  be  associated  with  physical  discomfort  of  the  back 
[42]. 

The strength of the conclusions we reached about the
hypotheses proposed in this study compels us to analyze
our collected data further with the purpose of understanding
the association of specific muscle groups with
fatigue. We also have begun to develop a method for calculating
the compressive joint reaction forces related to
physical stress. An understanding of underlying stress and
fatigue mechanisms, coupled with our current findings
from WLR analysis, promises to allow for further identification
and minimization of evident ergonomic risks in the
execution of camera-pointing and retraction tasks performed
by laparoscopic assistants.

Abstract — Traditional minimally invasive surgeries use a view
port provided by an endoscope. We argue that a useful addition
to typical endoscopic imagery would be a global 3D view providing
a wider field of view with explicit depth information for both the
exterior and interior of target anatomy. One way to implement
such a 3D view is to extract texture images from an endoscopic
video sequence and map the textures onto pre-built 3D objects.
This paper presents a novel method for registering endoscopic
videos to a 3D surface model derived from MR/CT. Our approach
differs from previous approaches because no camera tracking
information is required a priori. We take advantage of the fact that
nearby frames within a video sequence usually contain enough
coherence to allow a 2D-2D registration, which is a much
better-studied problem. The texturing process can be bootstrapped by
an initial 2D-3D user-assisted registration of the first video frame
followed by mostly-automatic texturing of subsequent frames.
We perform experiments on phantom and real data, validate the
algorithm against the ground truth, and compare it with the
traditional tracking method in simulations. Experiments show
that our method improves registration performance compared
to the traditional tracking approach.

Index Terms — MIS, Computer-augmented Reality, 3D Reconstruction,
3D to 2D Registration.

I.  Introduction 

Minimally invasive surgery (MIS) has become a major
component of modern surgical practice. Although certain risks
exist, MIS causes less operative trauma for the patient than an
equivalent open surgery. Many important MIS techniques are
realized by an endoscopic camera, providing an interior portal
into the patient’s body through which a surgeon views both
anatomical structures and surgical instruments. In addition
to the constrained movement of the surgical instruments and
difficult hand-eye coordination in endoscopic surgery, a surgeon
is faced with perceptual challenges such as the apparent
lack  of  3D  depth  information  and  the  narrow  field  of  view 
(FOV)  from  the  2D  camera.  Given  these  visual  limitations, 
endoscopic  surgery  often  takes  longer  than  its  counterpart  open 
surgery  ([1],  [2]).  Seasoned  surgeons  might  be  able  to  adjust 
themselves  to  this  challenging  display  environment;  however, 
this  accommodation  to  the  endoscopic  environment  requires 
increased  workload  and  lengthier  training  than  open  surgery 
([3],  [4],  [5]). 

One  important  goal  of  building  a  new  display  environment 
for endoscopic surgery is to provide enhanced intra-operative
guidance cues for surgeons. These cues would, ideally, include
depth  information  of  both  the  exterior  and  interior  of  the 
objects  and  a  wider  FOV,  which  should  ease  the  stress  on 
surgeons  associated  with  the  traditional  endoscopic  camera 
view  and  help  them  achieve  more  precise  and  reliable  surgical 
performance.  A  computer  augmented  environment  ([6],  [7])  is 
the  key  for  building  the  next  generation  endoscopic  surgical 
displays.  Such  an  environment  can  be  realized  by  combining 
3D  anatomical  models,  often  built  from  pre-operative  scans, 
and  the  realtime  2D  endoscopic  camera  view. 

One  way  to  combine  a  3D  anatomical  object  and  its 
corresponding  2D  endoscopic  view  is  to  use  general  3D-2D 
registration  techniques,  texturing  the  camera  video  sequence 
to  the  3D  surface  object.  The  challenge  is  determining  an 
accurate  mapping  between  coordinates  in  the  2D  and  3D  data 
sets.  There  has  been  an  increasing  demand  for  rendering, 
visualizing,  and  analyzing  the  object’s  geometry  and  its  surface 
texture in medical applications such as surgical planning,
non-invasive diagnosis and treatment, and image-guided surgery.
Traditionally,  if  geometry  and  texture  are  acquired  at  the  same 
time  with  the  same  sensor  ([8],  [9]),  then  the  alignment  of 
images  to  the  model  can  be  achieved  automatically  and  no 
further  3D-2D  registration  is  needed.  In  most  cases,  however, 
specialized 3D scanners are used to acquire the precise geometry,
and high quality digital cameras are used to capture
detailed  texture  information.  Thus,  the  images  have  to  be 
registered  with  the  3D  geometry  to  build  correspondences 
between  the  geometry  and  texture  information. 

This texture-to-geometry registration problem has been handled
by pose tracking of the camera ([10], [11]). With precise
instrumentation  and/or  fiducial  markers,  satisfactory  results 
have  been  obtained.  However,  there  are  certain  applications  in 
which  neither  of  these  requirements  can  be  satisfied.  In  MIS, 
endoscopes  that  are  introduced  through  a  small  port  into  a 
body  cavity  provide  high-resolution  video  images  of  a  limited 
view of the surgical site. Since the FOV of the endoscope
is small, a major limitation is the lack of enough fiducial marks
between consecutive frames. In addition, the endoscope cannot
be directly tracked in a surgical setting. Instead, a tracker is
attached to the outside of the endoscope, as shown in
Figure 12. The long offset between the tracker and the endoscope
magnifies the tracking error (see the analysis in the Appendix).
The  error  is  about  5.5  pixels  for  images  with  the  resolution 
640  x  640  as  computed  in  the  example.  Another  approach 
for 3D-2D registration is to find a parameterization of a 3D
surface model onto a 2D texture domain while satisfying
certain criteria, such as minimizing distortion and
complying with user-defined feature point locations (e.g., [12],
[13], [14]). While these methods are satisfactory for a single
image, manually selecting feature points in every
video frame is too tedious.

In this paper, we combine 2D tracking with
parameterization-based texture mapping without camera
tracking. The basic idea is to bootstrap the registration
process using an interactive method (such as the one from
[12]) and then to use vision-based 2D tracking to find sparse
2D-2D correspondences between images. Since the first frame
has 3D-2D correspondences, each new frame, based on its
overlap with the previous frame, can be “pasted” onto
the 3D model automatically using a parameterization-based
method. In this way, we convert the difficult 3D-to-2D
registration problem into a well-studied and understood
problem of 2D-to-2D matching. Our method is very general,
and it can be used for images that cannot be modeled by
perspective projection. In addition, when the camera is indeed
projective  and  the  object  is  (almost)  rigid,  we  introduce  a 
projective  correction  term  in  the  parameterization  process  so 
that  accurate  registration  can  be  achieved  over  a  wide  range 
of  viewpoint  changes.  Unlike  projective  texture  mapping,  this 
projective  correction  term  is  a  soft  constraint  so  our  method 
is  more  robust  against  errors  in  the  3D  model  or  even  slightly 
deformed  models.  In  both  cases,  we  avoid  the  problem  of 
camera  calibration  or  external  tracking. 

In  essence,  our  approach  combines  the  strength  of  both 
visual  tracking  (automatic)  and  parameterization-based  texture 
mapping  (more  flexible),  leading  to  a  new  means  to  quickly 
and  semi-automatically  add  textures  to  3D  models. 

In  the  remainder  of  this  paper,  we  review  related  work  in 
computer  augmented  surgery  and  recent  3D-2D  registration 
techniques  before  turning  to  a  detailed  description  of  the 
proposed  video  sequence  texture  mapping  algorithm.  We  then 
demonstrate  the  application  of  our  method  on  phantom  and 
real  data,  and  provide  an  evaluation  of  the  method  by  formally 
comparing  it  to  the  traditional  tracking  method. 

II.  Related  Work 

The  rich  literature  in  computer  augmented  surgery  includes 
methods  providing  better  perception  and  guidance  in  both  open 
and  MIS  surgeries.  The  use  of  preoperative  images  has  become 
a standard in these surgeries. Among the pioneering work, [6]
rendered  registered  stereo  views  from  endoscopic  cameras  to 
a  patient’s  exterior  anatomy,  to  augment  open  surgeries.  [15] 
presented  a  visualization  system  for  robot  assisted  coronary 
artery  bypass  graft  by  registering  a  preoperative  coronary  tree 
model  to  realtime  endoscopic  images  using  manually  picked 
landmarks.  [7]  projected  a  reconstructed  mesh  of  the  entire 
surgical  field  from  stereo  onto  3D  preoperative  images.  [16] 
registered  3D  surfaces  reconstructed  by  laser  range  scanner 
to  preoperative  CT  images  and  [17]  pushed  the  idea  one  step 
further  to  track  cortical  surface  deformation  via  serial  laser 
range scans and non-rigid image registration. [18] approached
real-time fusion of endoscopic views with dynamic 3D cardiac
images on a phantom by trying to solve registration of
deformable models. Most closely related to our proposed method,
[10] showed how to automatically register endoscopic brain
images to 3D brain meshes; this work is detailed further
when we discuss 2D and 3D registration methods, which are
a key component in computer augmented surgery.

Several  registration  techniques  of  a  2D  image  and  a  3D 
geometric  model  have  been  proposed  so  far.  The  first  approach 
estimates the pose of a camera through corresponding pairs of
2D and 3D point features, such as fiducial landmarks. The accuracy
of the pose estimation algorithm is determined by the accuracy
of locating the 2D and 3D features, the accuracy of finding
correspondences, and the number of features. It can be difficult
to locate landmarks, since some landmarks may be hidden.
Often, many features are needed to estimate the pose, and these
are not always available in medical imaging. Instead of directly
searching for 3D-2D point pairs, some registration techniques
use reflectance images. These approaches first extract feature
points [19] or edges [20] from reflectance images and texture
images, then apply some constraints (e.g. epipolar constraints
[20] and optical flow constraints [21]) to estimate pose
information. These methods also suffer from the problem of
locating features.

Some  other  registration  techniques  inspect  larger  image 
features  such  as  contour  lines  within  a  2D  image  and  a 
projected  image  of  3D  model.  With  the  correspondences  of  2D 
photometrical  contours  and  projected  3D  geometrical  contours 
on  the  2D  image,  the  camera  transformation  can  be  estimated 
by  minimizing  the  error,  which  can  be  defined  as  the  sum 
of  minimal  distance  between  the  3D  model  and  a  projection 
ray  ([22],  [23]),  or  the  sum  of  distances  between  points  on  a 
contour  of  an  image  and  on  a  projected  contour  line  of  a  3D 
model ([24], [25]). However, if the number of correspondences
is not sufficient, the contour-based approaches are likely to
converge to a local minimum when the initial alignment error
is large.

Once  an  initial  mapping  between  3D  and  2D  is  established, 
a  vision-based  tracking  technique  is  employed  to  improve  the 
robustness  of  the  registration.  It  first  extracts  features  using 
algorithms  such  as  the  Kanade-Lucas-Tomasi  (KLT)  ([26], 
[27])  and  the  Scale  Invariant  Feature  Transform  (SIFT)  ([28], 
[29]). The position and orientation of the camera are estimated
from feature correspondences. Then, the 3D-2D projective
transform is applied for the projective texture mapping onto the
geometry. Dey et al. [10] suggest this technique for the
fusion of freehand endoscopic brain images to the surfaces.
Methods  relying  solely  on  projective  mapping  require  that  the 
underlying  object  be  rigid,  which  may  not  be  true  for  the 
majority  of  organ  images. 

By minimizing distortions (e.g. area, length or angle distortion)
or satisfying user-defined positional constraints,
parameterization avoids the process of determining the parameters
of the cameras. Levy [12] suggests a method to satisfy the
user-specified constraints in the least-squares sense. Desbrun
et al. [13] use Lagrange multipliers to incorporate positional
constraints into the formulation of parameterization, and [30]
use Steiner vertices to satisfy the constraints. Matchmaker [14]
automatically partitions a mesh into genus-0 patches, and
satisfies the user-specified correspondences between the patches and
one or two texture image(s). Cross-parameterization [31] and
inter-surface mapping [32] use a similar approach of mapping
between two surfaces, rather than between a surface and
texture space, while Zhou et al. [33] suggest a similar approach
between a surface and multiple texture images.

III.  Methods 

This paper presents a new technique to map the acquired
endoscopic video (Figure 1a) onto the 3D object model.
Figure 1 illustrates the texturing pipeline,
which includes three stages: single view mapping, panorama
construction, and incremental texture mapping.

1) single view mapping: We adopt the Least Square Conformal
Mapping (LSCM) algorithm [34], where a user
can assign correspondences between a 3D model and a
2D texture map and the system optimizes a mapping that
minimizes distortions. The model is parameterized
(Figure 1c) over the first frame with a set of user-specified
correspondences (Figure 1b), using a linear system solver
(Figure 1d).

2) panorama construction: In Figure 1e, the current frame
taken from a different viewpoint is stitched with the
previous frames into a single large image that is continuous
in geometry and shading. This is a 2D-2D registration
process.

3) incremental texture mapping: This is the heart of
our algorithm, as shown in Figure 1f. The subsequent
endoscopic images from the video are mapped onto the
geometry incrementally. Our algorithm takes advantage
of the fact that nearby frames within a video sequence
usually contain enough coherence to allow a 2D-2D
registration. The system propagates the user constraints
defined in the single view mapping to the panorama.


A.  Single  View  Mapping 

Parameterizing a 3D surface over a plane means establishing
a one-to-one mapping between the planar domain and the
surface domain while keeping the metric distortion introduced
by the mapping to a minimum. To obtain a visually
pleasing result, feature correspondences have to be
enforced between the texture and the 3D surface. Our algorithm
combines the method of matching features [12] and the method
of least square conformal maps [34]. As shown in the dashed
red box in Figure 1, the single view mapping contains
three components: 3D-2D feature correspondences,
parameterization, and a linear system solver. To define the mapping
procedure, we introduce the following notation:

We represent the model with a triangle mesh
$\tau = \{\tau_1, \tau_2, \ldots, \tau_q\}$ with vertices
$\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$, where $\tau_i$ denotes
a triangle in the model. Parameterizing a mesh means constructing
a piecewise linear mapping between $\tau$ and a 2D texture $U$,
such as:

$$\chi : \tau \to U, \qquad \mathbf{x}_i \mapsto \mathbf{u}_i$$

where $\mathbf{x}_i = (x_i, y_i, z_i)^T$ denotes the 3D position of the node
in the original mesh $\tau$, and $\mathbf{u}_i = (u_i, v_i)^T$ is the 2D
position of the corresponding node in the 2D texture $U$.


Fig. 1. The pipeline of our system for texturing endoscopic images to 3D
surfaces: (a) images extracted from the endoscopic video; (b) 3D-2D feature
correspondences; (c) parameterization; (d) linear system solver; (e) panorama
construction; (f) incremental texture mapping; (g) the result of the single
view mapping; (h) the intermediate result of texture mapping; (i) the final
result of texture mapping.


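1) 3D-2D Feature Correspondence: Although a planar
parameterization of the mesh is widely used for texture mapping,
it cannot satisfy any specific correspondence between the mesh
and the texture ([34], [35]). The algorithm must handle
user-defined feature correspondences while maintaining a valid
one-to-one mapping between the 3D mesh and 2D texture
images or panoramas. The correspondences may be represented
by providing each vertex $\mathbf{x}_i$ with its image through a mapping
that satisfies [12]:

$$\psi(\mathbf{x}_i) = \lambda_1 \mathbf{u}_{i_1} + \lambda_2 \mathbf{u}_{i_2} + \lambda_3 \mathbf{u}_{i_3} \tag{1}$$

$$\mathbf{x}(\mathbf{u}_i) = \lambda_1 \mathbf{x}_{i_1} + \lambda_2 \mathbf{x}_{i_2} + \lambda_3 \mathbf{x}_{i_3} \tag{2}$$

where $(\lambda_1, \lambda_2, \lambda_3)$ are the barycentric coordinates at $\mathbf{x}_i$ in the
triangle $\{\mathbf{x}_{i_1}, \mathbf{x}_{i_2}, \mathbf{x}_{i_3}\}$, computed as follows:

$$\begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \end{pmatrix} = \frac{1}{A_{T_i}} \begin{pmatrix} y_{i_2} - y_{i_3} & x_{i_3} - x_{i_2} & x_{i_2} y_{i_3} - x_{i_3} y_{i_2} \\ y_{i_3} - y_{i_1} & x_{i_1} - x_{i_3} & x_{i_3} y_{i_1} - x_{i_1} y_{i_3} \\ y_{i_1} - y_{i_2} & x_{i_2} - x_{i_1} & x_{i_1} y_{i_2} - x_{i_2} y_{i_1} \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} \tag{3}$$

where $(x, y)$ are the coordinates of $\mathbf{x}_i$ in the plane of the triangle and
$A_{T_i} = (x_{i_1} y_{i_2} - x_{i_2} y_{i_1}) + (x_{i_2} y_{i_3} - x_{i_3} y_{i_2}) + (x_{i_3} y_{i_1} - x_{i_1} y_{i_3})$
denotes twice the area of the triangle.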

Each 3D-2D feature correspondence is defined as a pair
$(\mathbf{x}_j, \mathbf{u}_j)$, where $\mathbf{u}_j$ is the desired texture coordinate at
$\mathbf{x}_j$. Thus, we can define the objective function to minimize
the mapping error at all the $m$ constrained feature points:

$$C(\mathbf{u}) = \sum_{j=1}^{m} \left\| \mathbf{u}_j - \psi(\mathbf{x}_j) \right\|^2 \tag{4}$$

Recall from Equation 1 that the mapping at $\mathbf{x}_j$ is a linear
combination of the values $\mathbf{u}_{j_k}$ at the vertices of the triangle that
contains $\mathbf{x}_j$. Then, $C(\mathbf{u})$ can be written as follows by substituting
for $\psi(\mathbf{x}_j)$:

$$C(\mathbf{u}) = \sum_{j=1}^{m} \left[ \Big( u_j - \sum_{k=1}^{3} \lambda_k u_{j_k} \Big)^2 + \Big( v_j - \sum_{k=1}^{3} \lambda_k v_{j_k} \Big)^2 \right] \tag{5}$$
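As an illustration of Equations 3 and 5, the following Python sketch computes the barycentric coordinates of a point inside a planar triangle and the squared error contributed by one constrained feature. The function names `barycentric` and `feature_error` are hypothetical, and the triangle is assumed to be non-degenerate and already expressed in 2D coordinates; this is a sketch under those assumptions, not the implementation described in this paper.

```python
def barycentric(p, a, b, c):
    """Barycentric coordinates of the 2D point p in the triangle (a, b, c)."""
    (x, y), (x1, y1), (x2, y2), (x3, y3) = p, a, b, c
    # Twice the signed triangle area, the normalizer used in Equation 3.
    area2 = (x1 * y2 - x2 * y1) + (x2 * y3 - x3 * y2) + (x3 * y1 - x1 * y3)
    l1 = ((y2 - y3) * x + (x3 - x2) * y + (x2 * y3 - x3 * y2)) / area2
    l2 = ((y3 - y1) * x + (x1 - x3) * y + (x3 * y1 - x1 * y3)) / area2
    l3 = ((y1 - y2) * x + (x2 - x1) * y + (x1 * y2 - x2 * y1)) / area2
    return l1, l2, l3


def feature_error(target_uv, lambdas, corner_uvs):
    """Squared mapping error of one constrained feature (one summand of Eq. 5)."""
    u = sum(w * uv[0] for w, uv in zip(lambdas, corner_uvs))
    v = sum(w * uv[1] for w, uv in zip(lambdas, corner_uvs))
    return (target_uv[0] - u) ** 2 + (target_uv[1] - v) ** 2


# Example: a point inside the unit right triangle.
lam = barycentric((0.25, 0.25), (0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
err = feature_error((0.25, 0.25), lam, [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)])
```

The three weights sum to one, and the error vanishes when the texture coordinates at the triangle corners already interpolate to the desired feature location.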


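2) Parameterization: Assigning texture coordinates to a 3D
mesh can be regarded as a parameterization of the surface.
Consider the triangle $\tau_i = \{\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3\}$, with the texture
coordinates $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ associated with its vertices. The
gradient of the texture coordinates over $\tau_i$ can be expressed in the
local orthonormal basis $(\mathbf{x}_1, X, Y)$:

$$\begin{pmatrix} \partial u/\partial x & \partial v/\partial x \\ \partial u/\partial y & \partial v/\partial y \end{pmatrix} = \frac{1}{2 A_{\tau_i}} \begin{pmatrix} y_2 - y_3 & y_3 - y_1 & y_1 - y_2 \\ x_3 - x_2 & x_1 - x_3 & x_2 - x_1 \end{pmatrix} \begin{pmatrix} u_1 & v_1 \\ u_2 & v_2 \\ u_3 & v_3 \end{pmatrix} \tag{6}$$

where $(x_k, y_k)$ are the coordinates of $\mathbf{x}_k$ in the local basis and
$A_{\tau_i}$ is the area of the triangle. The basis can be computed as follows:

$$X = \frac{\mathbf{x}_2 - \mathbf{x}_1}{\| \mathbf{x}_2 - \mathbf{x}_1 \|}, \qquad Y = \frac{(X \times (\mathbf{x}_3 - \mathbf{x}_1)) \times X}{\| (X \times (\mathbf{x}_3 - \mathbf{x}_1)) \times X \|}$$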
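The local basis and the per-triangle gradient can be sketched in Python as follows. The helper names (`local_coords`, `texture_gradient`) are hypothetical, triangles are assumed to be non-degenerate, and the gradient is normalized by twice the triangle area; this is an illustrative sketch rather than the implementation described in this paper.

```python
import math


def sub(a, b):
    return [ai - bi for ai, bi in zip(a, b)]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def unit(a):
    n = math.sqrt(dot(a, a))
    return [ai / n for ai in a]


def local_coords(x1, x2, x3):
    """2D coordinates of a 3D triangle in its own orthonormal basis (x1, X, Y)."""
    X = unit(sub(x2, x1))
    Y = unit(cross(cross(X, sub(x3, x1)), X))
    return [(dot(sub(p, x1), X), dot(sub(p, x1), Y)) for p in (x1, x2, x3)]


def texture_gradient(xy, uv):
    """Return ((du/dx, dv/dx), (du/dy, dv/dy)), constant over one triangle."""
    (x1, y1), (x2, y2), (x3, y3) = xy
    area2 = (x1 * y2 - x2 * y1) + (x2 * y3 - x3 * y2) + (x3 * y1 - x1 * y3)
    row_x = (y2 - y3, y3 - y1, y1 - y2)
    row_y = (x3 - x2, x1 - x3, x2 - x1)
    ddx = tuple(sum(r * p[c] for r, p in zip(row_x, uv)) / area2 for c in (0, 1))
    ddy = tuple(sum(r * p[c] for r, p in zip(row_y, uv)) / area2 for c in (0, 1))
    return ddx, ddy


# Example: texture coordinates u = 2x + y, v = -x + 3y on a triangle at z = 1.
xy = local_coords((0.0, 0.0, 1.0), (2.0, 0.0, 1.0), (0.0, 3.0, 1.0))
uv = [(2 * x + y, -x + 3 * y) for x, y in xy]
grad = texture_gradient(xy, uv)
```

For a texture that is linear in the local coordinates, as in the example, the recovered gradient equals the coefficients of that linear function.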


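We use the least square conformal mapping, which is a
planar quasi-conformal parameterization method using a
least-square approximation of the Cauchy-Riemann equation [34].
The Cauchy-Riemann equation says $\psi$ is conformal on $\tau$ if
and only if the following equation

$$\frac{\partial \psi}{\partial x} + i \frac{\partial \psi}{\partial y} = 0 \tag{7}$$

holds true on each triangle of $\tau$. This conformal condition
in general cannot be strictly satisfied, so the violation of
this condition is minimized in the least square sense:

$$C(\tau) = \sum_{\tau_i \in \tau} \int_{\tau_i} \left| \frac{\partial \psi}{\partial x} + i \frac{\partial \psi}{\partial y} \right|^2 dA = \sum_{\tau_i \in \tau} \left| \frac{\partial \psi}{\partial x} + i \frac{\partial \psi}{\partial y} \right|^2 A_{\tau_i} \tag{8}$$

where the last equality follows since
$\partial \psi / \partial x + i \, \partial \psi / \partial y$ is a constant
complex number on each triangle. $A_{\tau_i}$ denotes the area of the
triangle $\tau_i$. Let $U_j = u_j + i v_j$ and

$$\begin{cases} W_1 = (x_3 - x_2) + i (y_3 - y_2) \\ W_2 = (x_1 - x_3) + i (y_1 - y_3) \\ W_3 = (x_2 - x_1) + i (y_2 - y_1) \end{cases} \tag{9}$$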

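The Cauchy-Riemann equation (Equation 7) can be rewritten as
follows using complex numbers:

$$\frac{\partial \psi}{\partial x} + i \frac{\partial \psi}{\partial y} = \frac{i}{2 A_{\tau_i}} \begin{pmatrix} W_1 & W_2 & W_3 \end{pmatrix} \begin{pmatrix} U_1 \\ U_2 \\ U_3 \end{pmatrix} = 0 \tag{10}$$

Thus, the objective function can be written in quadratic
form as

$$C(\tau) = \sum_{\tau_j \in \tau} \frac{1}{4 A_{\tau_j}} \left| \begin{pmatrix} W_{j_1,\tau_j} & W_{j_2,\tau_j} & W_{j_3,\tau_j} \end{pmatrix} \begin{pmatrix} U_{j_1} \\ U_{j_2} \\ U_{j_3} \end{pmatrix} \right|^2 \tag{11}$$

where $j_1, j_2, j_3$ are the vertex indices of triangle $\tau_j$ and
$A_{\tau_j}$ is its area.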
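To make the complex-number form concrete, the sketch below evaluates the conformal energy of a single triangle with Python's built-in `complex` type. The function name `conformal_energy` is hypothetical, and the triangle is assumed to be given in counter-clockwise order in its local 2D coordinates; it is a sketch under those assumptions, not the paper's implementation.

```python
def conformal_energy(xy, uv):
    """Energy |d(psi)/dx + i d(psi)/dy|^2 * area for one triangle.

    xy: three (x, y) vertex positions in the local basis of the triangle.
    uv: the three (u, v) texture coordinates assigned to those vertices.
    """
    (x1, y1), (x2, y2), (x3, y3) = xy
    area = 0.5 * abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))
    # Complex edge weights W_1, W_2, W_3 of Equation 9.
    W = (complex(x3 - x2, y3 - y2),
         complex(x1 - x3, y1 - y3),
         complex(x2 - x1, y2 - y1))
    U = [complex(u, v) for u, v in uv]
    residual = 1j / (2.0 * area) * sum(w * z for w, z in zip(W, U))
    return abs(residual) ** 2 * area


tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
# A rotation, scaling and translation of the triangle is conformal.
similar = [(3.0, -1.0), (5.0, 0.0), (2.0, 1.0)]
# Stretching only along x is not conformal.
stretched = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
e_similar = conformal_energy(tri, similar)
e_stretched = conformal_energy(tri, stretched)
```

The energy is zero for the similarity-transformed triangle and positive for the stretched one, which is the behavior the least-squares conformal term rewards and penalizes, respectively.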

3) Linear System Solver: The cost function $C(M)$ can be
defined as the weighted sum of Equations 5 and 11. It can
therefore be written as follows:

$$C(M) = C(\mathbf{u}) + \varepsilon \, C(\tau) = \sum_{k} \Big( b_k - \sum_{i=1}^{2n} a_{k,i} \, z_i \Big)^2 = \| A \mathbf{z} - \mathbf{b} \|^2 \tag{12}$$

where the vector $\mathbf{z}$ is composed of all unknowns $(u_i, v_i)$,
defined by $z_i = u_i$ and $z_{i+n} = v_i$, and $\varepsilon$ is the weight,
set to 10 in our experiments. We use the conjugate gradient method to
solve the least-squares minimization problem in Equation
12. Algorithm 1 summarizes the single view
texture mapping technique discussed above.


Algorithm 1: Single view mapping

1. initialize A = [], and b = 0.

2. for each feature correspondence between the surface
and the texture

(a) compute $\lambda_1$, $\lambda_2$, and $\lambda_3$ using Eq. 3.

(b) add one row in the sparse matrix A and set
the value of $\lambda_i$ in this row and b using Eq. 5,
based on the index of u and v.

3. for each triangle $\tau_i$ on the surface

(a) compute $W_1$, $W_2$, and $W_3$ using Eq. 6 and Eq. 9.

(b) add two rows corresponding to the real and
imaginary parts of $W_i$ in the sparse matrix A
and set the value of $W_i$ in these two rows,
based on the index of u and v.

4. compute z (i.e., the texture coordinates) using Eq. 12
by the conjugate gradient method.
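The steps of Algorithm 1 can be sketched in pure Python as below. Several choices are assumptions made for illustration and are not taken from the paper: each feature contributes two rows (one for u, one for v), the conformal rows are scaled by the square root of epsilon over four times the triangle area, rows are stored as dictionaries, and conjugate gradient is applied to the normal equations. All function names are hypothetical.

```python
import math


def local_xy(p1, p2, p3):
    """Triangle vertices in the local orthonormal basis (x1, X, Y)."""
    e1 = [b - a for a, b in zip(p1, p2)]
    e2 = [b - a for a, b in zip(p1, p3)]
    n1 = math.sqrt(sum(c * c for c in e1))
    X = [c / n1 for c in e1]
    x3 = sum(a * b for a, b in zip(e2, X))
    perp = [a - x3 * b for a, b in zip(e2, X)]  # part of e2 orthogonal to X
    return (0.0, 0.0), (n1, 0.0), (x3, math.sqrt(sum(c * c for c in perp)))


def solve_cg(rows, rhs, size, iters=500, tol=1e-18):
    """Minimize ||A z - b||^2 by conjugate gradient on A^T A z = A^T b."""
    def mul(z):
        return [sum(v * z[c] for c, v in r.items()) for r in rows]

    def mul_t(y):
        out = [0.0] * size
        for r, yi in zip(rows, y):
            for c, v in r.items():
                out[c] += v * yi
        return out

    z = [0.0] * size
    r = mul_t(rhs)  # residual of the normal equations at z = 0
    p = r[:]
    rr = sum(x * x for x in r)
    for _ in range(iters):
        if rr < tol:
            break
        Ap = mul_t(mul(p))
        alpha = rr / sum(a * b for a, b in zip(p, Ap))
        z = [zi + alpha * pi for zi, pi in zip(z, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        rr_new = sum(x * x for x in r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return z


def single_view_mapping(verts, tris, features, eps=10.0):
    """features: list of (triangle index, (l1, l2, l3), (u, v)) constraints."""
    n = len(verts)
    rows, rhs = [], []
    for t, lam, (u, v) in features:  # step 2: feature rows
        idx = tris[t]
        rows.append({i: w for i, w in zip(idx, lam)})
        rhs.append(u)
        rows.append({i + n: w for i, w in zip(idx, lam)})
        rhs.append(v)
    for idx in tris:  # step 3: conformal rows
        (x1, y1), (x2, y2), (x3, y3) = local_xy(*(verts[i] for i in idx))
        area = 0.5 * ((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))
        W = [(x3 - x2, y3 - y2), (x1 - x3, y1 - y3), (x2 - x1, y2 - y1)]
        s = math.sqrt(eps / (4.0 * area))
        real, imag = {}, {}
        for i, (a, b) in zip(idx, W):
            real[i], real[i + n] = s * a, -s * b
            imag[i], imag[i + n] = s * b, s * a
        rows += [real, imag]
        rhs += [0.0, 0.0]
    z = solve_cg(rows, rhs, 2 * n)  # step 4
    return [(z[i], z[i + n]) for i in range(n)]


# Example: a unit square split into two triangles, with two pinned corners.
verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
tris = [(0, 1, 2), (0, 2, 3)]
features = [(0, (1.0, 0.0, 0.0), (0.0, 0.0)), (0, (0.0, 0.0, 1.0), (2.0, 2.0))]
uv = single_view_mapping(verts, tris, features)
```

In this example the two constraints pin opposite corners of the square to (0, 0) and (2, 2), and the conformal term fills in the remaining corners so that the square is scaled uniformly by two.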


B.  Panorama  Construction 

The full scene cannot always be captured by a single view.
It is common to take multiple images and compose them
into a panorama. There are two main techniques for panorama
construction: the feature-based approach [36] and the image-based
approach [37]. The feature-based approach has a lower matching
complexity than the image-based approach, since
it uses only a few salient image features, such as corners,
while the image-based approach attempts to match bitmaps. We
use a feature-based approach similar to the one in [36], which
is insensitive to the ordering, orientation, scale, and illumination of
the images and to image noise. We briefly outline the procedure
here (see [36] for details).

1) Extract and match SIFT features between all pairs of the
images.

2) Use RANSAC to select a set of inliers that are compatible
with a similarity transformation between each pair
of images.

3) Verify the match and discard all feature matches which
are not geometrically consistent with the transformation
between the images.

4) Given the set of geometrically consistent matches
between the images, apply bundle adjustment to solve for
all the transformations jointly.
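Step 2 of the procedure above can be illustrated with a small RANSAC loop. The sketch assumes that matched point pairs are already available (feature extraction, match verification, and bundle adjustment are not shown), represents a 2D similarity as q = a*p + b with complex numbers, and uses hypothetical function names and an arbitrary 2-pixel inlier threshold.

```python
import random


def fit_similarity(p1, p2, q1, q2):
    """Similarity q = a*p + b (complex a and b) through two point pairs."""
    a = (q2 - q1) / (p2 - p1)
    return a, q1 - a * p1


def ransac_similarity(src, dst, thresh=2.0, trials=200, seed=0):
    """Return (a, b, inlier indices) of the best similarity that was sampled."""
    rng = random.Random(seed)
    P = [complex(*p) for p in src]
    Q = [complex(*q) for q in dst]
    best = (None, None, [])
    for _ in range(trials):
        i, j = rng.sample(range(len(P)), 2)
        if P[i] == P[j]:
            continue  # degenerate sample, cannot define a similarity
        a, b = fit_similarity(P[i], P[j], Q[i], Q[j])
        inliers = [k for k in range(len(P)) if abs(a * P[k] + b - Q[k]) < thresh]
        if len(inliers) > len(best[2]):
            best = (a, b, inliers)
    return best


# Example: six matches follow a rotation plus translation, two are outliers.
src = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 2), (3, 8), (7, 7), (1, 4)]
a_true, b_true = complex(0.8, 0.6), complex(5.0, -3.0)
dst = [(z.real, z.imag) for z in (a_true * complex(*p) + b_true for p in src[:6])]
dst += [(100.0, 100.0), (-50.0, 40.0)]
a_est, b_est, inliers = ransac_similarity(src, dst)
```

Matches outside the returned inlier set would be the ones discarded in step 3.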

C.  Video  Sequence  Mapping 

We propose two approaches to achieve video sequence mapping:
extrapolation texture mapping and incremental texture
mapping. They transfer the 3D-2D feature correspondences
that are defined in the single view mapping to new frames,
using the 2D feature matches found during the panorama
construction stage.

To describe the mapping procedure conveniently, we introduce
the following notation:

• $T_n$: the time of adding the frame $n$.

• $F_n$: the new frame to be mapped onto the 3D surface at
$T_n$.

• $P_n$: the panorama constructed from $P_{n-1}$ and $F_n$ at $T_n$.

• $M_{F_n \leftrightarrow P_{n-1}}$: the set of matching feature points between
$P_{n-1}$ and $F_n$, $\{(\mathbf{u}_{P_{n-1},j}, \mathbf{u}_{F_n,j})\}$, where $\mathbf{u}_{P_{n-1},j}$ is one
feature point in $P_{n-1}$, and $\mathbf{u}_{F_n,j}$ is the corresponding
feature point in $F_n$.

• $M_{P_{n-1} \leftrightarrow P_n}$: the set of matching feature points between
$P_{n-1}$ and $P_n$, $\{(\mathbf{u}_{P_{n-1},j}, \mathbf{u}_{P_n,j})\}$, where $\mathbf{u}_{P_{n-1},j}$ is one
feature point in $P_{n-1}$, and $\mathbf{u}_{P_n,j}$ is the corresponding
feature point in $P_n$.

The problem of video sequence mapping that we need to
solve is: given a 3D model with a well-mapped area where
we have the 3D-2D correspondences between the model $\tau$
and the texture or panorama $P_{n-1}$, map $F_n$ onto
the model.

Extrapolation Texture Mapping: Figure 2 shows the process
of this approach. For the first frame at time $T_0$ in the video
sequence, we can register it with the 3D model interactively
or via a few fiducial marks. At time $T_n$ ($n > 0$), the new
panorama $P_n$ is constructed from $P_{n-1}$ and $F_n$ using the
panorama construction algorithm described above. Given
the 3D-2D correspondences $\chi : \mathbf{x}_i \to \mathbf{u}_{P_{n-1},i}$ (Figure 2a)
between $P_{n-1}$ and the model $\tau$ and the 2D-2D correspondences
$M_{P_{n-1} \leftrightarrow P_n} = \{(\mathbf{u}_{P_{n-1},j}, \mathbf{u}_{P_n,j})\}$ (Figure 2b) between $P_{n-1}$
and $P_n$, we can transfer the correspondences to $\chi : \mathbf{x}_j \to \mathbf{u}_{P_n,j}$
(Figure 2c) between the model and $P_n$, because the set $\{\mathbf{u}_{P_{n-1},j}\}$
belongs to the set $\{\mathbf{u}_{P_{n-1},i}\}$. Thus, we know a partial set of
3D-2D feature correspondences between $P_n$ and the model. The
objective function can be defined to minimize the mapping
error at these transferred correspondences as follows:

$$C(\mathbf{u}) = \sum_{j=1}^{m} \left\| \mathbf{u}_{P_n,j} - \chi(\mathbf{x}_j) \right\|^2 \tag{13}$$


Fig. 2. Extrapolation texture mapping. (a) The 3D-2D correspondences
between the model $\tau$ and $P_{n-1}$. (b) The 2D-2D correspondences between
$P_{n-1}$ and $P_n$. (c) The 3D-2D correspondences to be found between the
model $\tau$ and $P_n$.

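Similar to Equation 5, $C(\mathbf{u})$ can be rewritten:

$$C(\mathbf{u}) = \sum_{j=1}^{m} \left[ \Big( u_{P_n,j} - \sum_{k=1}^{3} \lambda_k u_{j_k} \Big)^2 + \Big( v_{P_n,j} - \sum_{k=1}^{3} \lambda_k v_{j_k} \Big)^2 \right] \tag{14}$$

where $m$ is the number of transferred correspondences
between the model and $P_n$. By taking the weighted sum of
the terms from Equations 14 and 11, the objective function
$C(M)$ can be written as follows:

$$C(M) = C(\mathbf{u}) + \varepsilon \, C(\tau) = \sum_{k} \Big( b_k - \sum_{i=1}^{2n} a_{k,i} \, z_i \Big)^2 \tag{15}$$

$\varepsilon$ is set to 10 in our experiments. The linear system solver
of the single view mapping algorithm is employed to solve the
problem above.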

The advantage of extrapolation texture mapping is that it can
deal with arbitrary images, as long as there is a large amount
of overlap between $P_{n-1}$ and $F_n$. However, since no additional
feature constraints are added to the 3D model, the mapping can
drift over an extended sequence. To address this limitation, we
propose another method, incremental texture mapping.
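The transfer of correspondences used by extrapolation texture mapping can be sketched as follows: each matched 2D point in the previous panorama is lifted to the 3D model through the already known texture coordinates, and the resulting 3D point is paired with the matched location in the new panorama. The function names are hypothetical, a linear search over triangles is assumed for simplicity, and texture-space triangles are assumed to be non-degenerate; this is not the paper's implementation.

```python
def barycentric(p, a, b, c):
    """Barycentric coordinates of the 2D point p in the triangle (a, b, c)."""
    d = ((a[0] * b[1] - b[0] * a[1]) + (b[0] * c[1] - c[0] * b[1])
         + (c[0] * a[1] - a[0] * c[1]))
    l1 = ((b[1] - c[1]) * p[0] + (c[0] - b[0]) * p[1]
          + b[0] * c[1] - c[0] * b[1]) / d
    l2 = ((c[1] - a[1]) * p[0] + (a[0] - c[0]) * p[1]
          + c[0] * a[1] - a[0] * c[1]) / d
    return l1, l2, 1.0 - l1 - l2


def lift_to_model(u, verts, uvs, tris, tol=1e-9):
    """3D point whose texture coordinate in the previous panorama is u.

    Returns None when u falls outside the mapped part of the mesh.
    """
    for tri in tris:
        lam = barycentric(u, *(uvs[i] for i in tri))
        if min(lam) >= -tol:
            return tuple(sum(w * verts[i][k] for w, i in zip(lam, tri))
                         for k in range(3))
    return None


def transfer(matches, verts, uvs, tris):
    """Turn 2D-2D matches (u_prev, u_new) into 3D-2D pairs (x, u_new)."""
    pairs = []
    for u_prev, u_new in matches:
        x = lift_to_model(u_prev, verts, uvs, tris)
        if x is not None:
            pairs.append((x, u_new))
    return pairs


# Example: one mapped triangle, one match inside it and one outside it.
verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
uvs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
tris = [(0, 1, 2)]
matches = [((0.25, 0.25), (10.0, 20.0)), ((2.0, 2.0), (0.0, 0.0))]
pairs = transfer(matches, verts, uvs, tris)
```

Only the first match produces a 3D-2D pair, because the second point lies outside the region that has already been textured.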

Incremental Texture Mapping: We assume the images
are captured by a perspective camera. For each frame of the
video sequence, we can find the projection matrix $P$ between
the model and this frame. Thus, we can use the projective
constraint to improve the texture mapping. Figure 3 shows
the per-frame computation procedure.

Fig. 3. Incremental texture mapping. (a) The 3D-2D correspondences between
the model $\tau$ and $P_{n-1}$. (b) The 2D-2D correspondences between $P_{n-1}$ and
$F_n$. (c) Computation of the projection matrix $P$ for frame $F_n$. (d) The 2D-2D
correspondences between frame $F_n$ and $P_n$. (e) The 3D-2D correspondences
to be found between the model $\tau$ and $P_n$. (f) The 2D-2D correspondences
between $P_{n-1}$ and $P_n$.

First, the first frame at time $T_0$ in the video sequence is
registered with the 3D model manually. At time $T_n$ ($n > 0$), we
have the panoramic texture $P_{n-1}$, the frame $F_n$ to be
registered, and the mapping $\chi : \mathbf{x}_i \to \mathbf{u}_i$ at $T_{n-1}$ (Figure 3a). Next,
the feature correspondences $M_{F_n \leftrightarrow P_{n-1}} = \{(\mathbf{u}_{P_{n-1},j}, \mathbf{u}_{F_n,j})\}$
between $P_{n-1}$ and $F_n$ are found using the KLT algorithm (Figure
3b). For each feature point $\mathbf{u}_{F_n,j}$, we can find the corresponding
point $\mathbf{x}_{F_n,j}$ on the 3D model, because $\chi : \mathbf{x}_i \to \mathbf{u}_i$ is known and
$\mathbf{u}_{F_n,j}$ is in the set $\{\mathbf{u}_i\}$. Thus, we have the correspondences
$\{(\mathbf{x}_{F_n,j}, \mathbf{u}_{F_n,j})\}$ between $F_n$ and the 3D model. Given these
correspondences, we can compute a 3 x 4 projection matrix $P$
such that $\mathbf{u}_i = P \mathbf{x}_i$, using the algorithm in [38] for estimating
$P$ (Figure 3c). Similarly, we can find the $\hat{m}$ correspondences
$M_{F_n \leftrightarrow P_n} = \{(\mathbf{u}_{P_n,j}, \mathbf{u}_{F_n,j})\}$ between $P_n$ and $F_n$ using the KLT
algorithm (Figure 3d). For each feature point $\mathbf{u}_{P_n,j}$ of $P_n$
in $M_{F_n \leftrightarrow P_n}$, the 3D position $\mathbf{x}_{P_n,j}$ can be found by
intersecting the inverse projection ray with the 3D model.
Then, the projective term can be defined as follows:

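$$\mathbf{u}_{P_n,j} = P \, \mathbf{x}_{P_n,j} \tag{16}$$

$$C(\mathbf{x}_{P_n}) = \sum_{j=1}^{\hat{m}} \left\| \mathbf{u}_{P_n,j} - \chi(\mathbf{x}_{P_n,j}) \right\|^2 \tag{17}$$

where $\hat{m}$ is the number of correspondences in $M_{F_n \leftrightarrow P_n}$.
Similar to Equation 5, $C(\mathbf{x}_{P_n})$ can be rewritten:

$$C(\mathbf{x}_{P_n}) = \sum_{j=1}^{\hat{m}} \left[ \Big( u_{P_n,j} - \sum_{k=1}^{3} \lambda_k u_{i_k} \Big)^2 + \Big( v_{P_n,j} - \sum_{k=1}^{3} \lambda_k v_{i_k} \Big)^2 \right] \tag{18}$$

In addition, we can use the constraints that come from the
extrapolation texture mapping (Figure 3e). The objective function
$C(M)$ can be defined as the weighted sum of Equations 11,
15 and 18:

$$C(M) = C(\mathbf{u}) + \varepsilon \, C(\tau) + \iota \, C(\mathbf{x}_{P_n}) = \sum_{k} \Big( b_k - \sum_{i=1}^{2n} a_{k,i} \, z_i \Big)^2 \tag{19}$$

where $\varepsilon$ is set to 10 and $\iota$ is set to 2 in our examples.
Algorithm 2 summarizes the incremental
texture mapping technique discussed above.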

Compared to extrapolation texture mapping, incremental
texture mapping significantly improves the robustness of
texture mapping, because it includes more constraints defined by
perspective projection.
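The projection matrix used by the projective term can be estimated from 3D-2D correspondences in several ways. The algorithm of [38] is not reproduced here; the sketch below swaps in a basic direct linear transform that fixes the last entry of P to 1 and solves the normal equations by Gaussian elimination. It assumes at least six correspondences in general (non-coplanar) position, and all function names are hypothetical.

```python
def solve_linear(M, y):
    """Solve M a = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    aug = [row[:] + [yi] for row, yi in zip(M, y)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    a = [0.0] * n
    for c in reversed(range(n)):
        tail = sum(aug[c][k] * a[k] for k in range(c + 1, n))
        a[c] = (aug[c][n] - tail) / aug[c][c]
    return a


def estimate_projection(points3d, points2d):
    """Least-squares 3x4 matrix P with P[2][3] fixed to 1."""
    rows, rhs = [], []
    for (X, Y, Z), (u, v) in zip(points3d, points2d):
        rows.append([X, Y, Z, 1.0, 0.0, 0.0, 0.0, 0.0, -u * X, -u * Y, -u * Z])
        rhs.append(u)
        rows.append([0.0, 0.0, 0.0, 0.0, X, Y, Z, 1.0, -v * X, -v * Y, -v * Z])
        rhs.append(v)
    # Normal equations (A^T A) p = A^T b for the 11 unknown entries.
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(11)] for i in range(11)]
    Atb = [sum(r[i] * b for r, b in zip(rows, rhs)) for i in range(11)]
    p = solve_linear(AtA, Atb) + [1.0]
    return [p[0:4], p[4:8], p[8:12]]


def project(P, x):
    """Project the 3D point x with P and divide by the homogeneous scale."""
    h = [sum(P[r][k] * c for k, c in enumerate(list(x) + [1.0])) for r in range(3)]
    return h[0] / h[2], h[1] / h[2]


# Example: recover a known matrix from eight noise-free correspondences.
P_true = [[1.0, 0.0, 0.1, 0.2], [0.0, 1.0, -0.1, 0.1], [0.05, 0.02, 0.2, 1.0]]
pts = [(0.9, 0.1, -0.3), (-0.7, 0.8, 0.2), (0.3, -0.9, 0.7), (-0.2, -0.4, -0.8),
       (0.6, 0.7, 0.9), (-0.9, -0.6, 0.4), (0.1, 0.5, -0.6), (0.8, -0.3, 0.1)]
obs = [project(P_true, p) for p in pts]
P_est = estimate_projection(pts, obs)
```

Fixing one entry of P is a simplification that fails when the true value of that entry is zero; a homogeneous formulation would avoid this at the cost of requiring a singular value decomposition.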


Algorithm 2: Incremental texture mapping

1. initialize n = number of frames, i = 1, and $P_1 = F_1$.

2. while i ≤ n

(1) case I (i == 1): perform single-view mapping for $P_1$.

(2) case II (1 < i ≤ n):

a) construct panorama $P_i$ using $P_{i-1}$ and $F_i$.

b) find correspondences between $P_{i-1}$ and $F_i$.

c) compute the projection matrix $P$ for the frame $F_i$.

d) find correspondences $M_{F_i \leftrightarrow P_i}$ between $F_i$ and $P_i$.

e) compute 3D positions for feature points in $M_{F_i \leftrightarrow P_i}$.

f) compute z (i.e., the texture coordinates) using
Eq. 19 by the conjugate gradient method.

(3) i = i + 1.

3. exit.


IV.  Results 

A.  Visual  Assessment 

Based on the approaches presented above, we have
implemented a flexible system for texture mapping. It provides an
interactive GUI in which the user edits the correspondences
between the model and the first frame. After that, the system
can automatically register the subsequent frames from the
video sequence, or the user can add more correspondences to
improve the accuracy. Three examples of varying complexity
are used to demonstrate our system. Figure 4, Figure 5, and
Figure 6 show the 3D model, the original frames extracted
from the video sequence, the intermediate registration result,
and a view of the final texture mapping result on the surface.


B.  Error  Analysis 

A video image is a projection of the 3D scene onto a 2D
imaging plane. This can be modeled using the pinhole camera model:

$$\begin{pmatrix} \lambda u \\ \lambda v \\ \lambda \end{pmatrix} = K_{int} \cdot M_{ext} \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \tag{20}$$

where $\mathbf{x} = (x, y, z, 1)^T$ denotes a world point in homogeneous
coordinates, $\mathbf{u} = (u, v, 1)^T$ is a point on the 2D image, $K_{int}$
denotes the 3 x 3 intrinsic matrix, representing the
projection of 3D scene coordinates to the 2D image plane, $M_{ext}$
represents the 3 x 4 extrinsic matrix, describing the position
and orientation of the camera relative to the model, and $\lambda$
is a scale factor for homogeneous coordinates. The extrinsic
matrix has six degrees of freedom (DOF), corresponding to
three translations $t_x$, $t_y$, and $t_z$ along the $x$, $y$, and $z$ axes,
respectively, and three rotations $r_x$, $r_y$, and $r_z$ about the
$x$, $y$, and $z$ axes, respectively.
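As a small illustration of Equation 20, the sketch below builds an extrinsic matrix from the six pose parameters and projects a 3D point to pixel coordinates. The rotation order (rotation about x, then y, then z), the focal length of 500 pixels, and the principal point at the center of a 640 x 640 image are assumptions chosen for the example, and the function names are hypothetical.

```python
import math


def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def extrinsic(tx, ty, tz, rx, ry, rz):
    """3x4 matrix [R | t] with R = Rz(rz) Ry(ry) Rx(rx); angles in radians."""
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    Rx = [[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]]
    Ry = [[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]]
    Rz = [[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]]
    R = matmul(Rz, matmul(Ry, Rx))
    return [R[i] + [t] for i, t in enumerate((tx, ty, tz))]


def project(K, M, point):
    """Apply Equation 20 and divide by the scale factor lambda."""
    Q = matmul(K, M)  # 3x4 registration matrix
    h = [sum(Q[r][c] * p for c, p in enumerate(list(point) + [1.0]))
         for r in range(3)]
    return h[0] / h[2], h[1] / h[2]


K = [[500.0, 0.0, 320.0], [0.0, 500.0, 320.0], [0.0, 0.0, 1.0]]
M = extrinsic(0.0, 0.0, 5.0, 0.0, 0.0, 0.0)
pixel = project(K, M, (1.0, -1.0, 0.0))
```

Perturbing any of the six pose parameters changes the product of the two matrices and therefore the pixel to which a model point is mapped, which is how pose errors turn into registration errors.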

In the tracking system, geometrical features such as
points or lines are generally used to establish
correspondences between the 2D video images and 3D models, and
the correspondences are used to find the intrinsic parameters and
the extrinsic parameters $t_x, t_y, t_z, r_x, r_y$ and $r_z$. The video image
noise and the soft tissue deformation in the model can lead to wrong
correspondences. The wrong correspondences are the primary
source of the registration error in the clinical application, since
they result in incorrect camera parameters and camera
pose. These wrong pose parameters $t_x, \ldots, r_z$ produce an
estimate $\hat{Q}$ of the true registration matrix $Q = K_{int} \cdot M_{ext}$
for each frame.

Fig. 4. Registration result of real data, pig liver. (a) Liver model with 2446 vertices and 4498 triangles; (b) 4 sample frames extracted from the video
sequence; (c) intermediate registration result with 20 frames, in which we use 20 3D-2D feature correspondences for the first frame; (d) final registration
result with 60 frames.

Fig. 5. Registration result of phantom data, front of human intestine model. (a) The front of the human intestine model with 9986 vertices and 20,000 triangles; (b) 4
sample frames extracted from the video sequence; (c) intermediate registration result with 16 frames, in which we use 16 3D-2D feature correspondences
for the first frame; (d) final registration result with 40 frames.

Fig. 6. Registration result of phantom data, back of human intestine model. (a) The back of the human intestine model with 9986 vertices and 20,000 triangles; (b)
4 sample frames extracted from the video sequence; (c) intermediate registration result with 20 frames, in which we use 21 3D-2D feature correspondences
for the first frame; (d) final registration result with 40 frames.

To assess the robustness of our approach, we perform
experiments that simulate the primary sources of
registration error: image noise and soft tissue deformation. We
compare our method with the traditional tracking method.
The performance is analyzed for varying levels of noise
on the images and for different variations of the
surface model. Both methods produce an estimate of the
ground-truth texture coordinates after registration to all
the video frames. To assess the accuracy of the integration of
endoscopic images with the 3D model, we measure the error
as the mean and standard deviation of the Euclidean distance
in pixels between the ground truth and the estimated texture
coordinates after registration.
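The error measure described above can be sketched in a few lines of Python. The sketch assumes texture coordinates normalized to [0, 1] that are converted to pixels by multiplying with the image width and height, and it uses the population standard deviation; both choices, and the function name, are assumptions made for illustration.

```python
import math


def registration_error(truth, estimate, width, height):
    """Mean and standard deviation (pixels) of the per-vertex texture error."""
    dists = [math.hypot((u - a) * width, (v - b) * height)
             for (u, v), (a, b) in zip(truth, estimate)]
    mean = sum(dists) / len(dists)
    std = math.sqrt(sum((d - mean) ** 2 for d in dists) / len(dists))
    return mean, std


# Example: one vertex is off by (3, 4) pixels, the other is exact.
truth = [(0.0, 0.0), (0.5, 0.5)]
estimate = [(0.003, 0.004), (0.5, 0.5)]
mean_err, std_err = registration_error(truth, estimate, 1000, 1000)
```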


1) Simulation of image noise: This test investigates how the performance of the algorithms varies with image noise. In the simulation, we use the mean kidney model in Figure 7 and 6 texture images, which have pre-calculated texture coordinates normalized to [0,1] as the ground truth. Zero-mean Gaussian noise with different standard deviations is added to the ground-truth texture coordinates of the mean kidney model. The texture coordinates are clipped so that they still lie within the range 0-1. This was repeated with noise of standard deviation 0.001, 0.008, 0.016, 0.024, 0.032, 0.048 and 0.064 texture coordinate values. For each frame, 30 2D-3D correspondences are randomly selected to estimate the registration matrix $Q$, using the algorithm in [38]. The texture coordinate of each vertex is estimated by $x_t = Q \cdot x$. Similarly, we obtain the estimated texture coordinate $x_i$ of each vertex after applying incremental texture mapping to the simulated data with 30 3D-2D feature correspondences for the first frame. Thus, the mean and deviation of the mapping error




JOURNAL  OF  TRANSACTIONS  ON  MEDICAL  IMAGING,  VOL.  ?,  NO.  ?,  JANUARY  2009 


Fig. 7. The leftmost model is the mean kidney from the shape statistics trained on 30 segmented kidneys; the remaining six models are among the 49 kidney models sampled via Monte Carlo sampling, shown from a different angle.


Fig. 8. Mean registration error of our method versus the traditional tracking method in the simulation of image noise. Gaussian noise is added to the images with standard deviation $\sigma$ = 0.001, 0.008, 0.016, 0.024, 0.032, 0.048, and 0.064. The red line indicates the results of our method.


can be computed. Figure 8 and Figure 9 compare the registration error of our method with that of the tracking method for images with different noise levels.
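The noise-injection step of this experiment can be sketched as below. It is an illustrative reading of the description, assuming texture coordinates stored as (u, v) tuples; the helper names `add_clipped_noise` and `noisy_copies` and the fixed seed are hypothetical.

```python
import random

# Standard deviations listed in the text, in texture-coordinate units.
NOISE_LEVELS = [0.001, 0.008, 0.016, 0.024, 0.032, 0.048, 0.064]


def add_clipped_noise(coords, sigma, rng):
    """Add zero-mean Gaussian noise to each (u, v) pair and clip to [0, 1]."""
    return [
        tuple(min(1.0, max(0.0, value + rng.gauss(0.0, sigma))) for value in uv)
        for uv in coords
    ]


def noisy_copies(coords, seed=0):
    """Return one perturbed copy of the ground-truth coordinates per noise level."""
    rng = random.Random(seed)
    return {sigma: add_clipped_noise(coords, sigma, rng) for sigma in NOISE_LEVELS}
```

Each perturbed copy would then be passed to the registration method under test and scored against the unperturbed coordinates.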

The red line indicates the performance of our method, whereas the blue line shows the performance of the tracking method. The results show that our method is more accurate than the tracking method when noise is introduced into the images. The reason is that image noise, even with a small standard deviation, produces wrong correspondences, which lead to large variation in the estimate of the registration matrix $Q$. As the noise increases, our algorithm becomes progressively less accurate, but it still performs better than the tracking method and does not fail completely. Furthermore, the standard deviation of the noise in captured endoscopic images is seldom higher than 0.024.

2) Simulation of soft tissue deformation: To simulate tissue deformation, we generated 49 sample kidney models from 30 segmented human kidneys, each of which contains over ten thousand vertices and twenty thousand triangles. Each sampled model may be considered a reasonable variation of the mean kidney model. Figure 7 shows some samples of these kidney models. We apply the tracking method and our algorithm to register these models to the 6 texture


Fig. 9. Standard deviation of the registration error of our method versus the traditional tracking method in the simulation of image noise. Gaussian noise is added to the images with standard deviation $\sigma$ = 0.001, 0.008, 0.016, 0.024, 0.032, 0.048, and 0.064. The red line indicates the results of our method.


images, whose texture coordinates are pre-calculated for the mean kidney model and the other 49 kidney models as the ground truth. In the tracking method, 30 3D-2D feature correspondences between the mean kidney model and each image are randomly selected for each image, while in our method 30 3D-2D correspondences between the mean kidney model and the first image are manually selected. For each model, the mean and standard deviation of the registration error are calculated as discussed in the section above. The experimental results are shown in Figure 10 and Figure 11. The red line indicates the performance of our method, whereas the blue line shows the performance of the tracking method. The results show that our algorithm is not affected much by tissue deformation, whereas the tracking method is very sensitive to it; our algorithm is therefore the more robust of the two to soft deformation of the models.

Limitation: Because the method is incremental, error can still accumulate over a long sequence (a common problem in tracking). When the error becomes noticeable, one can easily add a few more control points to fix the drift. If a fully automatic process is desired (such as in a live surgical setting), fiducial points should be used. But unlike existing

Fig. 10. Mean registration error of our method versus the traditional tracking method in the simulation of soft tissue deformation. The 49 kidney models are variations of the mean kidney model. The red line indicates the results of our method.

Fig. 11. Standard deviation of the registration error of our method versus the traditional tracking method in the simulation of soft tissue deformation. The 49 kidney models are variations of the mean kidney model. The red line indicates the results of our method.


approaches that rely solely on fiducials, our method can dramatically reduce the number of fiducial marks required.


V.  Conclusion 

Our system provides a flexible, practical tool for texturing arbitrary models from video sequences. It allows the user to specify an arbitrary number of correspondences between the model and the texture images. The algorithm then incrementally computes the texture coordinates of the model's vertices through an optimization that satisfies the user-defined constraints. The texture mapping becomes more accurate as more corresponding feature points are selected between the 3D object and the 2D texture. Our system places no hard constraints on the number of texture images or the complexity of the models.


Fig.  12.  The  endoscope  that  is  used  in  our  experiment. 


Fig. 13. A simplified example demonstrating that the long offset between the endoscope and the tracker magnifies the tracking error. $\varepsilon_t$ represents the tracking error of the tracker in images, corresponding to the minimal error $\beta$; $\varepsilon_e$ denotes the magnified error of the endoscope in images; $s = |OP|$ is the length between the tip of the tracker and its virtual image plane; $d = |OL|$ is the length of the long offset.


Incremental texture mapping is achieved automatically or with a few user-defined operations. We believe our approach is particularly useful for image-guided surgery, in which precise camera calibration and tracking are difficult.

VI. Appendix

In a MIS setup, the imaging sensor on the endoscope cannot be directly tracked. A tracking device is usually attached to the other end of the scope, i.e. the end that is outside the body, as shown in Figure 12. The long offset between the endoscope and the tracker magnifies the tracking error. This can be seen from the simplified example in Figure 13, where the long offset $OL$ is collinear with the principal ray $OP$ of the tracker, and the tracker and endoscope have the same minimal error $\beta$ and the same image resolution. In general, the tracker can be defined as a mapping from the 3D world to a 2D image using the pinhole camera model. When the tracker has an error $\varepsilon_t$ in images, corresponding to the minimal error $\beta$, the error of the endoscope is $\varepsilon_e = \frac{d}{s} \cdot \varepsilon_t$, where $s = |OP|$ is the length between the tip of the tracker and its virtual image plane, and $d = |OL|$ is the length of the long offset. When the tracker has a minimal error of 0.1°, a 30° FOV, and $\frac{d}{s} = 10$, the endoscopic error may be about 5.5 pixels for images with a resolution of 640 x 640. A detailed quantitative analysis follows.

As shown in Figure 14, the calibration process of the endoscope has two parts: the first determines the intrinsic parameters $K$ of the endoscope; the second computes and refines the transformation $M_{w \leftrightarrow c}$ between the endoscope and the tracker. The tracker can be represented with the matrix $P_{w \leftrightarrow o} = K[R|t]$, where $[R|t]$ is the external matrix, $R$ is a 3 x 3 rotation matrix representing the orientation of the tracker, $t = -RC$, and $C$ represents the coordinates of the tracker tip in the world coordinate system. Similarly, we can represent the endoscope with $P_{c \leftrightarrow o} = K[R|t]M_{w \leftrightarrow c}$. We assume that $K$ and $M_{w \leftrightarrow c}$ are constants and that they introduce no error after the calibration. We define $K[R'|t']$ as the noisy projection matrix of the tracker, corresponding to the correct projection matrix $K[R|t]$. In the same way, the noisy projection matrix of the camera is $K[R'|t']M_{w \leftrightarrow c}$, with the correct projection matrix $K[R|t]M_{w \leftrightarrow c}$. The error of the tracker for a pixel can be defined as $K([R'|t']/\alpha_1 - [R|t]/\alpha_2)x$, and the error of the camera for a pixel is $K([R'|t']/\alpha_3 - [R|t]/\alpha_4)M_{w \leftrightarrow c}x$, where $x$ is a point in the world coordinate system and $\alpha_i$ ($i = 1, \ldots, 4$) are scales for the pixels' homogeneous coordinates. To measure the error, we consider maximizing the following objective:


$$J(x) = \frac{\|P_t M_{w \leftrightarrow c} x\|^2}{\|P_c x\|^2} \tag{21}$$


where $P_t = K([R'|t']/\alpha_1 - [R|t]/\alpha_2)$ and $P_c = K([R'|t']/\alpha_3 - [R|t]/\alpha_4)$. An important property of $J$ is that it is invariant with respect to rescalings of $x$. For this reason, we can transform the problem of maximizing $J$ into the following constrained optimization problem,


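$$\min_x \; -\frac{1}{2} x^T M_{w \leftrightarrow c}^T P_t^T P_t M_{w \leftrightarrow c} x \quad \text{s.t.} \quad x^T P_c^T P_c x = 1 \tag{22}$$

corresponding to the Lagrangian,

$$L_P = -\frac{1}{2} x^T M_{w \leftrightarrow c}^T P_t^T P_t M_{w \leftrightarrow c} x + \frac{1}{2} \lambda \left(x^T P_c^T P_c x - 1\right) \tag{23}$$

By applying the KKT conditions, we find that the following equation needs to hold at the solution,

$$M_{w \leftrightarrow c}^T P_t^T P_t M_{w \leftrightarrow c} x = \lambda P_c^T P_c x \tag{24}$$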

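This is a generalized eigenvalue problem. Plugging the solution back into the objective $J$, we find,

$$J(x) = \frac{\|P_t M_{w \leftrightarrow c} x\|^2}{\|P_c x\|^2} = \frac{x^T M_{w \leftrightarrow c}^T P_t^T P_t M_{w \leftrightarrow c} x}{x^T P_c^T P_c x} = \frac{\lambda \, x^T P_c^T P_c x}{x^T P_c^T P_c x} = \lambda \tag{25}$$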


from which it immediately follows that the maximum magnification of the endoscope error with respect to the tracker error is $\sqrt{\lambda_{max}}$. The analysis above shows how the long offset between the tracker and the endoscope affects the tracking method.
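To make the last step concrete, the toy sketch below computes the largest generalized eigenvalue of a pair of symmetric 2 x 2 matrices in closed form and takes its square root. It is only an illustration under stated assumptions: `A` stands in for $M^T P_t^T P_t M$ and `B` for $P_c^T P_c$, both reduced to 2 x 2, and `B` is assumed positive definite (the 4 x 4 matrices in the appendix need not be). The function names are hypothetical.

```python
import math


def largest_generalized_eigenvalue_2x2(A, B):
    """Largest lambda solving det(A - lambda * B) = 0 for symmetric 2x2 A, B.

    Assumes B is positive definite, so both roots are real.
    """
    (a11, a12), (_, a22) = A
    (b11, b12), (_, b22) = B
    # det(A - lambda*B) expands to the quadratic qa*lambda^2 + qb*lambda + qc.
    qa = b11 * b22 - b12 * b12
    qb = -(a11 * b22 + a22 * b11 - 2.0 * a12 * b12)
    qc = a11 * a22 - a12 * a12
    disc = max(qb * qb - 4.0 * qa * qc, 0.0)
    return (-qb + math.sqrt(disc)) / (2.0 * qa)


def max_magnification(A, B):
    """Square root of the largest generalized eigenvalue."""
    return math.sqrt(largest_generalized_eigenvalue_2x2(A, B))


def rayleigh_quotient(A, B, x):
    """x^T A x / x^T B x for a 2-vector x."""
    num = sum(x[r] * A[r][c] * x[c] for r in range(2) for c in range(2))
    den = sum(x[r] * B[r][c] * x[c] for r in range(2) for c in range(2))
    return num / den
```

No vector `x` gives a Rayleigh quotient above the returned eigenvalue, which is the property the derivation relies on. For full-size matrices a numerical generalized eigensolver would be used instead of the closed form.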


Fig. 14. Diagram illustrating the calibration process of the endoscope (diagram labels: Robot/Tracker; Object coordinate system). $M_{w \leftrightarrow c}$ represents the transformation between the endoscope and the tracker; $P_{w \leftrightarrow o}$ denotes the tracker model; $P_{c \leftrightarrow o}$ stands for the endoscope model.

ABSTRACT

The incorporation of pre-operative images into minimally invasive surgeries (MIS) can provide surgeons with a global field of view and important visual cues that would otherwise be unavailable, such as a tumor mostly hidden inside a target object. This paper focuses on using a geometric deformation model to guide the registration of a 3D model, pre-built from pre-operative images, to a small set of 3D surface landmarks reconstructed from a laparoscopic video sequence. Intra-operative tissue deformations of the target object are used to estimate deformations of its important substructures, such as a tumor or a group of vessels. Evaluation results of the registration and the deformation estimation show that our method performs well on the type of soft tissue deformations in our driving application.

Index Terms: Deformable Model, Geometric Deformation Model, Intra-operative Navigation, Minimally Invasive Surgery

1.  INTRODUCTION 

Minimally invasive surgery (MIS) has the advantage of inducing less trauma to patients. However, the narrow field of view provided by an endoscopic or laparoscopic camera is one of the limiting factors of this technique. Recent research work has focused on incorporating preoperative images into MIS to provide more information intra-operatively.

The authors of [1] proposed reconstructing a dense 3D surface from intra-operative data using Markov Random Field (MRF) based Bayesian belief propagation to smoothly fuse multiple depth cues, which requires repeatable tracking of a large number of feature points. In [2], a statistical deformation model was built from simulated deformations of a finite element model of prostates. As an extension to [2], Hu et al. included material properties in their simulated deformations and used a pre-learned statistical motion model to predict a displacement field over an entire prostate volume based on surface landmarks on the prostate [3].

This paper proposes a method to estimate and visualize intra-operative tissue deformations of a target anatomical object. Our method only requires a small number of reconstructed surface landmarks on the object, and it does not require modeling material properties. The estimated deformations of the target object are applied to its substructures, such as a tumor mostly or entirely hidden within the surface of the target object. One of our driving applications is laparoscopic renal cryoablation of small tumors. A 3D visualization of the renal tumor during a laparoscopic cryoablation can greatly increase the positioning accuracy of the ablation site, compared with the current ultrasound-guided procedure.

In our method, a specific geometric deformation model is built for each target kidney to guide the registration of the pre-built kidney model to a small set of 3D surface points reconstructed from a laparoscopic video sequence. Once the model itself is registered, its deformations are applied to the tumor model. The tumor can then be visualized intra-operatively. If pre-planned needle insertion paths are available, they can also be transformed into the camera view to better guide the ablation procedure.

The next section details our proposed method through its three main components: the geometric deformation model, the registration of the 3D model, and the implied deformations for the substructure(s). Section 3 describes the evaluation of the proposed method. Section 4 concludes this paper with a discussion.

2.  METHOD 

In  order  to  incorporate  the  pre-operative  scans  efficiently  for 
a  specific  patient,  we  adopt  a  deformable  model  framework. 
The  main  components  of  our  method  are  shown  as  follows: 

1. Building a 3D model from the pre-operative scan image;

2. Building a geometric deformation model for the 3D model, controlled by a small number of parameters;

3. Identifying a small set of surface feature points of anatomical significance on the target object from the laparoscopic video sequence, tracking the feature points, and reconstructing 3D landmarks from these tracked feature points;

4. Registering the pre-built 3D model to the reconstructed 3D landmarks using the geometric deformation model;

5. Applying the deformation of the registered 3D model to its internal or adjacent substructures, such as a tumor.

The 3D model is built from segmented pre-operative CT images. At present, the images are segmented by experts who contour them manually, and a 3D deformable model is then fitted to the segmentation using the binary fitting method described in [4]. The discrete m-rep [5] is chosen as the shape model because of its unique property of modeling both the surface and the interior volume of an object.

For the 3D reconstruction from a laparoscopic video sequence, a small set of anatomical points is identified by surgeons intra-operatively from the video sequence and tracked throughout the procedure. Artificial markers can be used to increase the accuracy of the feature tracking. This set of tracked points is then fed into a structure from motion (SFM) toolkit to reconstruct the corresponding 3D surface landmarks. Although this step is not the main focus of this paper, the accuracy of the 3D reconstruction is crucial to the following steps. Research is ongoing to improve the reconstruction accuracy of the landmarks.

This paper focuses on components 2, 4, and 5 described above. Based on the assumption that a kidney goes through moderate global shape deformations before the needle insertion for a cryoablation, a geometric deformation model is defined to describe the global deformations, including bending and twisting. The modes of deformation, controlled by 3 parameters, are used as a shape prior model to guide the registration of the pre-built model to the reconstructed set of landmarks.

The  advantage  of  our  method  is  that  only  a  small  set  of 
landmarks  is  needed,  so  it  is  feasible  to  use  artificial  markers 
to  ensure  robust  tracking  of  the  corresponding  feature  points 
and  thus  to  ensure  accurate  reconstruction  of  the  landmarks. 
The  rest  of  this  section  details  components  2,  4,  and  5. 

2.1.  Definition  of  a  geometric  deformation  model 

A discrete m-rep $M$ consists of a quad-mesh of basic components called medial atoms $m_i$, $i = 1, 2, \ldots, n_m$. Each internal atom $m_i$ has a hub position $p_i$ and two spokes $S_i^{+1,-1}$ with length $r_i$ and directions $U_i^{+1,-1}$. Atoms at the edge of the quad-mesh also have a third spoke $S_i^0$ with length $\eta_i r_i$ and direction $U_i^0$. More details of the m-rep can be found in [5].

To define a geometric deformation model for an m-rep $M$, two axis directions for $M$ are defined to form a reference frame. Assume the set of hub positions and spoke end points of all medial atoms in $M$ is $S = \{s_j, j = 1, 2, \ldots, n_s\}$. Principal component analysis (PCA) is applied to the points in $S$,

Fig. 1. Demonstration of the 3 modes of deformation (bending and twisting; recovered panel label: Twisting): each row shows one mode from -A, 0, to A, from left to right.


and the first two principal directions, $A_1$ and $A_2$, corresponding to the two largest eigenvalues of the covariance matrix, are defined as the axis directions of the local frame. The "origin" $O$ of the frame is set at the arithmetic mean of the points in $S$. 3 geometric deformations are defined for each $M$ based on the reference frame. These 3 deformations were chosen by experienced urologists with a comprehensive understanding of the deformations of kidneys:

• $B_1$ relative to $A_1$: for each atom $m_i$, let the distance from its hub position to axis $A_1$ be $d_{i,1}$; atom $m_i$ is rotated around $A_1$ by an angle of $\alpha_1 d_{i,1}^2$;

• $B_2$ relative to $A_2$: for each atom $m_i$, let the distance from its hub position to axis $A_2$ be $d_{i,2}$; atom $m_i$ is rotated around $A_2$ by an angle of $\alpha_2 d_{i,2}^2$;

• $Tw$: project the hub position of each medial atom $m_i$ into the reference frame spanned by $A_1$, $A_2$. Assume the projected point is $(x'_i, y'_i)$; $m_i$ is rotated around the axis $A_1$ by an angle of $\beta x'_i$

where $\alpha_1, \alpha_2, \beta$ are normally distributed random variables controlling the amount of bending relative to $A_1$, bending relative to $A_2$, and twisting $Tw$, respectively. These three deformations are demonstrated in figure 1 for one kidney model.
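The construction of the reference frame (origin at the mean of the points, axes along the first two principal directions) can be sketched in plain Python. This is an illustrative sketch only: it uses power iteration with deflation in place of a full eigen-decomposition, assumes the points span at least two directions with distinct variances, and all function names are hypothetical.

```python
import math


def mean_point(points):
    """Arithmetic mean of a list of 3D points (the frame origin O)."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def covariance(points, origin):
    """3x3 covariance matrix of the points about the origin."""
    n = len(points)
    return [[sum((p[r] - origin[r]) * (p[c] - origin[c]) for p in points) / n
             for c in range(3)] for r in range(3)]


def matvec(m, v):
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dominant_direction(m, iterations=200):
    """Largest eigenvalue and unit eigenvector of a symmetric PSD matrix."""
    v = normalize([1.0, 0.7, 0.3])  # arbitrary start vector
    for _ in range(iterations):
        v = normalize(matvec(m, v))
    eigenvalue = sum(a * b for a, b in zip(v, matvec(m, v)))
    return eigenvalue, v


def reference_frame(points):
    """Return (O, A1, A2): mean point and the first two principal directions."""
    origin = mean_point(points)
    cov = covariance(points, origin)
    lam1, a1 = dominant_direction(cov)
    # Deflation: remove the first component so the next one becomes dominant.
    deflated = [[cov[r][c] - lam1 * a1[r] * a1[c] for c in range(3)]
                for r in range(3)]
    _, a2 = dominant_direction(deflated)
    return origin, a1, a2
```

The sign of each returned direction is arbitrary, as with any PCA axis.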

The means of $\alpha_1$, $\alpha_2$, and $\beta$ are all set to 0, and their variances are 0.3, 0.15, and 0.2, respectively. These values are empirically determined to ensure that the combined deformation applied to $M$ does not yield any illegal shapes with local shape defects, such as folding. 100 sets of $(\alpha_1, \alpha_2, \beta)$ are sampled from the 3 zero-mean normal distributions. The combined deformation $D = Tw B_2 B_1$ is applied to $M$ to get $M' = D \circ (M)$. Principal geodesic analysis (PGA) [6] is applied to the 100 deformed m-reps to form a deformation prior model, given as a Frechet mean $\bar{M}$, the first few principal geodesic directions $v_k$ representing more than 99% of the total shape variations, and the variances $\lambda_k$ corresponding to the principal geodesic directions. In practice, the first 3 principal geodesic directions and variances $(v_k, \lambda_k)$, $k = 1, 2, 3$, are used to describe the deformation model: $\{\bar{M} \mid \lambda_k, v_k\}$.
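The sampling of the deformation parameters can be sketched as follows. This is a minimal illustration, assuming independent draws; the names `VARIANCES` and `sample_parameters` and the fixed seed are hypothetical. Note that `random.gauss` takes a standard deviation, so the variances quoted above are converted with a square root.

```python
import math
import random

# Variances of (alpha1, alpha2, beta) as stated in the text; all means are 0.
VARIANCES = (0.3, 0.15, 0.2)


def sample_parameters(count=100, seed=0):
    """Draw `count` independent (alpha1, alpha2, beta) triples."""
    rng = random.Random(seed)
    return [
        tuple(rng.gauss(0.0, math.sqrt(var)) for var in VARIANCES)
        for _ in range(count)
    ]
```

Each triple would parameterize one combined deformation applied to the m-rep before PGA is run on the resulting set.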

2.2. Model registration

The mean model $\bar{M}$ is registered (fitted) to a set of reconstructed landmarks $L = \{l_i, i = 1, 2, \ldots, n_l\}$. For each $l_i$, there is a corresponding point on the implied surface of $\bar{M}$. The fitting is implemented by minimizing an objective function:

$$M' = \arg\min F(M'(\{c_j, j = 1, 2, 3\}))$$

where $M'$ is constructed from $\bar{M}$ and a tangent vector $\sum_{j=1}^{3} c_j v_j$ via the exponential map [6], and $\{c_j\}$ is the set of parameters controlling the deformations. $F(M')$ has three components, $F(M') = t_1 F_{fit}(M') + t_2 F_{regularization}(M') + (1 - t_1 - t_2) F_{leg}(M')$, with $t_1$ and $t_2$ as two tuning parameters satisfying $t_1, t_2, t_1 + t_2 \in (0, 1)$:

• $F_{fit} \propto \sum_i dis_i^2 / r_{mean}^2$ measures the fitting quality of the model to the set of landmarks, where $dis_i$ is the Euclidean distance between a landmark $l_i$ and its corresponding surface point on the implied surface of $M'$, and where $r_{mean}$ is the geometric mean of the radii of all medial atoms in $M'$ and is used to make $F_{fit}$ unitless, commensurate with the other 2 components of the objective function;

• $F_{maha} = \sum_{j=1}^{3} c_j^2 / \lambda_j$ (the regularization term) is the squared Mahalanobis distance of the current model $M'$ to the mean model $\bar{M}$, penalizing large deformations of $M'$;

• $F_{leg} = \sum_{i=1}^{n_m} f_{leg}(m'_i)$, where $m'_i$ is a medial atom in $M'$, and where $f_{leg}$ is the illegality penalty term defined by equation (12) in [4].

The overall objective function is optimized over the principal geodesic components $c_j$ in the deformation model, while large deformations and shape illegalities are penalized. To initialize the optimization, all $\{c_j\}$ are set to 0, and $\bar{M}$ is aligned to the set of landmarks $L$ by a rigid transformation plus scaling. The objective function is optimized via the conjugate gradient method. With only 3 parameters, the optimization converges fast, normally within 30-40 iterations.

2.3. Deformation application to important substructures

The models before and after the registration are used to imply a deformation field for the interior and the adjacent exterior volume of the target object. An m-rep parameterizes the interior and adjacent exterior volume. When the original m-rep is built from a segmented pre-operative image, its substructures, such as a tumor, are also built. The implied deformation field is applied to the tumor volume, voxel by voxel. Because the illegality penalty term $f_{leg}$ in the objective function enforces the legality of the deformed $M'$, the volumetric legality of both models, before and after the registration, is guaranteed. Therefore, the implied deformation field is guaranteed to be legal, without any local folding. The next section evaluates the proposed method and shows the results.

Fig. 2. Sorted evaluation results of registered kidney models or estimated tumor models in terms of average surface distances (ASD) between the registered or estimated models and their corresponding ground truth surface meshes (plotted series: Estimation Errors and Registration Errors, both as average surface distance in mm). For almost all testing cases, the ASDs for the implied deformations to tumors are bigger than the ASDs for the registered kidney models.

3. RESULTS

In order to evaluate the proposed method, a set of kidney models with synthetic deformations is generated. Synthetic data provide the ground truth to measure the registration and estimation errors.

Learned PGA shape statistics from 50 kidney models are sampled by the Monte Carlo sampling scheme [7] to generate 100 testing kidney m-rep models $\{M_i, i = 1, 2, \ldots, 100\}$. A reference frame represented by $A_{i,1}$ and $A_{i,2}$ centered at $O_i$ is calculated for each kidney m-rep $M_i$. $M_i$ is then transformed to align its reference frame to standard axes $(y, z)$ centered at the origin. Each kidney volume, implied by the m-rep $M_i$, is warped by a diffeomorphic deformation that is defined independently from the m-rep. We used the same set of deformations, including bending and twisting, to keep the synthesized deformations close to the real kidney deformations. Each diffeomorphic deformation $T$ is defined as follows:


$$x' = x \tag{1}$$

$$y' = y \cos(\beta x) - z \sin(\beta x) + \alpha_2 x^2 \tag{2}$$

$$z' = y \sin(\beta x) + z \cos(\beta x) + \alpha_1 x^2 \tag{3}$$

where $\alpha_1$, $\alpha_2$ and $\beta$ are 3 random variables controlling the deformation.
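Equations (1)-(3) translate directly into code. The sketch below applies the warp to a single point and also writes out an explicit inverse, which is one way to see that the map is one-to-one; the function names `warp` and `unwarp` are hypothetical, and the inverse is derived here rather than taken from the paper.

```python
import math


def warp(point, alpha1, alpha2, beta):
    """Apply the twist-plus-bend deformation T of equations (1)-(3)."""
    x, y, z = point
    c, s = math.cos(beta * x), math.sin(beta * x)
    return (
        x,
        y * c - z * s + alpha2 * x * x,
        y * s + z * c + alpha1 * x * x,
    )


def unwarp(point, alpha1, alpha2, beta):
    """Invert T: remove the bending offsets, then undo the rotation by beta*x."""
    x, y_w, z_w = point
    c, s = math.cos(beta * x), math.sin(beta * x)
    u = y_w - alpha2 * x * x
    v = z_w - alpha1 * x * x
    return (x, u * c + v * s, -u * s + v * c)
```

Because x is left unchanged, the rotation angle and the bending offsets can be recomputed from the warped point, which is what makes the inverse explicit.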




Fig. 3. 3 testing kidney m-reps (columns: Original, Warped, Fitted), from left to right: the original kidney m-rep (in blue), the ground truth surface mesh of the kidney and tumor (in red) reconstructed from the warped object volume, and the fitted (registered) m-rep $M'_i$ with the deformation applied to its attached tumor model.

Because the Jacobian determinant of $T$ is 1, $T$ is indeed a diffeomorphic deformation. A tumor m-rep $M_{tumor}$ is attached to each testing kidney m-rep $M_i$. A randomized $T_i$ is applied to warp the volumes of $M_i$ and $M_{tumor}$. Surface meshes $S'_{kidney,i}$ and $S'_{tumor,i}$ are reconstructed from the warped kidney and tumor volumes, respectively, as the ground truth.
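The Jacobian claim can be checked directly from the definition of $T$. Writing $\theta = \beta x$ and differentiating $x' = x$, $y' = y\cos\theta - z\sin\theta + \alpha_2 x^2$, $z' = y\sin\theta + z\cos\theta + \alpha_1 x^2$ with respect to $(x, y, z)$ gives

```latex
\det \frac{\partial (x', y', z')}{\partial (x, y, z)}
= \det \begin{pmatrix}
1 & 0 & 0 \\
\partial y' / \partial x & \cos\theta & -\sin\theta \\
\partial z' / \partial x & \sin\theta & \cos\theta
\end{pmatrix}
= \cos^2\theta + \sin^2\theta = 1 .
```

The entries $\partial y'/\partial x$ and $\partial z'/\partial x$ do not affect the result because the first row is $(1, 0, 0)$, so the determinant reduces to that of the lower-right rotation block.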

A set of 6 landmarks $L_i$ is automatically identified on each deformed kidney surface $S'_{kidney,i}$. Using the proposed method, each model $M_i$ is fitted to the set of landmarks $L_i$ to form a registered kidney model $M'_i$. The registration error is measured by the average surface distance between the implied surface mesh of $M'_i$ and the ground truth surface $S'_{kidney,i}$. The deformation between $M_i$ and $M'_i$ is then applied to the tumor $M_{tumor}$ to form an estimated tumor model $M'_{i,tumor}$. The estimation error is measured by the average surface distance between the implied surface mesh of $M'_{i,tumor}$ and the ground truth surface $S'_{tumor,i}$. The evaluation results of the registration and estimation are shown in figure 2.

The average surface distance is within the range of 1.35-2.17 mm, which is comparable to the results reported in [3]. Three testing kidneys are shown in figure 3.
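The text does not spell out how the average surface distance is computed. One common vertex-based approximation, given here purely as an assumption, averages the nearest-vertex distance in both directions between two meshes; the function name `average_surface_distance` is hypothetical.

```python
import math


def average_surface_distance(verts_a, verts_b):
    """Symmetric mean nearest-vertex distance between two vertex sets.

    A brute-force sketch: for each vertex of one set, take the distance to the
    closest vertex of the other set, average, and do the same in reverse.
    """
    def one_way(source, target):
        return sum(min(math.dist(p, q) for q in target) for p in source) / len(source)

    return 0.5 * (one_way(verts_a, verts_b) + one_way(verts_b, verts_a))
```

With vertex coordinates in millimetres the result is in millimetres, matching the units reported above. A point-to-triangle distance would be more faithful for coarse meshes.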

4.  DISCUSSION 

The evaluation results show that our proposed registration and deformation estimation method performs well on simulated kidney models with moderate deformations. We are working on evaluating the proposed method on captured laparoscopic video sequences. Unlike previously proposed methods, our method uses a geometric deformation model and relies on the reconstruction of only a small set of 3D landmarks to estimate moderate soft tissue deformations. Given the small number of landmarks, it is feasible to use robust markers to significantly improve the quality of tracking, so we expect to have robust reconstructions in future experiments. Another direct application of our method is surgical planning. For example, with the renal tumor accurately located, pre-planned needle insertion paths can be adapted to the tissue deformations in real time.

One limitation of our method is that it is not designed to directly estimate large deformations. The current setup is suitable for our driving applications, such as renal cryoablation, in which the kidney can be controlled to have moderate deformations. One way to extend our method to handle large tissue deformations is to extend the geometric deformation model to a statistical deformation model learned from a sufficient amount of training data.

The Maryland Virtual Patient (MVP) project aims to create an environment where a physician
user  can  manage  a  complex  virtual  patient  who  is  suffering  from  one  or  more  diseases.  Once 
developed,  this  environment  will  allow  for  multiple  capabilities,  including  trial-and-error 
learning,  tutored  learning,  and  assistance  with  problem  solving  in  treating  real  patients.  Last  year 
we  presented  our  progress  by  demonstrating  virtual  patients  suffering  from  complex  esophageal 
diseases  that  progressed  over  time.  These  patients  behaved  in  a  clinically  appropriate  fashion  and 
reacted  in  realistic  ways  to  interventions.  The  physician  could  use  drop-down  menus  to  select 
queries,  tests  and  interventions,  as  well  as  observe  the  responses  and  subsequent  disease 
progression.  Simulation  was  supported  by  an  intelligent  agent  in  the  MVP  whose  main 
component  is  a  model  of  normal  and  abnormal  physiology.  During  the  past  year,  we  have 
developed  three  additional  components  of  the  intelligent  agent:  a)  two  types  of  perception: 
perception  of  stimuli  originating  inside  of  the  body  (interoception)  and  the  perception  of  natural 
language  communication  (language  perception),  b)  a  model  of  cognitive  decision-making  and  c) 
a  model  of  verbal  and  simulated  physical  action. 

Perception:  The  physiological  component  of  the  agent  communicates  with  the  cognitive 
component  as  simulated  interoception  to  produce  symptom  perception  by  the  cognitive  agent. 
Language  perception  allows  the  cognitive  agent  to  understand  the  meaning  of  inputs  it  obtains 
from  the  physician  user. 

Cognitive  Decision-Making:  This  module  of  the  system  uses  several  types  of  input  and 
knowledge  to  model  agent  decision  making,  including:  interoception;  input  from  the  physician; 
the  resident  knowledge  possessed  by  the  agent;  and  the  agent’s  personality  traits. 

Simulated  Action:  In  the  current  simulation,  the  action  can  be  verbal  (for  example,  a  response  to 
the  physician’s  question),  or  simulated  physical  (for  example,  taking  medication  or  presenting  at 
the physician’s office). Our objective is to model these actions in a way that would be natural for
people.


The  variables  used  in  building  this  model  of  an  intelligent  virtual  patient  include: 

• Life goals, such as the desire to be healthy

• Character traits, such as

o Attitude toward visits to the physician
o Courage to be treated
o Trust in the physician’s skill
o Suggestibility with respect to the physician’s recommendations

• Physiological traits, such as

o Tolerance to pain
o Tolerance to symptoms
o Tolerance to external stressors

• Intellectual traits, such as

o Memory/forgetfulness of symptoms and events
o Knowledge about the disease
o Knowledge about the tests and interventions
o Retention of knowledge gleaned in conversation with the physician.
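
One way to picture these variables is as a structured patient profile. The Python sketch below is only an illustration: the class names, field names, the 0-to-1 scale, and the default values are all assumptions, not the representation used in the MVP system.

```python
from dataclasses import dataclass, field, fields
from typing import List


def _check_unit_interval(obj) -> None:
    """Validate that every field lies in [0, 1] (the scale is an assumption)."""
    for f in fields(obj):
        value = getattr(obj, f.name)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{f.name} must be in [0, 1], got {value}")


@dataclass
class CharacterTraits:
    attitude_toward_visits: float = 0.5
    courage_to_be_treated: float = 0.5
    trust_in_physician: float = 0.5
    suggestibility: float = 0.5

    def __post_init__(self) -> None:
        _check_unit_interval(self)


@dataclass
class PhysiologicalTraits:
    pain_tolerance: float = 0.5
    symptom_tolerance: float = 0.5
    stressor_tolerance: float = 0.5

    def __post_init__(self) -> None:
        _check_unit_interval(self)


@dataclass
class IntellectualTraits:
    memory_of_events: float = 0.5
    disease_knowledge: float = 0.5
    intervention_knowledge: float = 0.5
    retention_from_conversation: float = 0.5

    def __post_init__(self) -> None:
        _check_unit_interval(self)


@dataclass
class VirtualPatientProfile:
    """Hypothetical container grouping the four kinds of variables."""
    life_goals: List[str] = field(default_factory=lambda: ["be healthy"])
    character: CharacterTraits = field(default_factory=CharacterTraits)
    physiology: PhysiologicalTraits = field(default_factory=PhysiologicalTraits)
    intellect: IntellectualTraits = field(default_factory=IntellectualTraits)


if __name__ == "__main__":
    # A hypothetical trusting patient with low pain tolerance.
    patient = VirtualPatientProfile(
        character=CharacterTraits(trust_in_physician=0.9),
        physiology=PhysiologicalTraits(pain_tolerance=0.2),
    )
    print(patient)
```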

Language  Processing  Capabilities. 

When  free-text  input  is  received  from  the  physician,  it  is  interpreted  and  assigned  a  formal  text 
meaning  representation.  During  this  process,  both  the  meaning  and  intention  of  the  input  are 
determined.  Indirect  speech  acts,  such  as  implied  questions  not  stated  in  a  question  format,  are 
handled  appropriately.  Physician  input  that  is  currently  interpreted  by  the  MVP  agent  includes: 

1. a request for physiological data (test results), perception of symptoms, and memory of
health events

2.  a  request  for  permission  to  carry  out  a  test  or  intervention 

3.  a  request  to  perform  an  intervention 

4.  a  request  to  return  to  see  the  physician  at  a  later  date 

5.  a  response  to  an  MVP  question 

6.  unsolicited  knowledge  provided  to  the  MVP  by  the  physician 

Natural  language  output  is  provided  to  the  physician  after  processing  by  the  MVP  agent.  All 
responses  are  based  upon  evaluations  of  the  physiological  state  of  the  MVP  in  concert  with  the 
perception  of  symptoms  and  cognitive  functions.  Potential  MVP  agent  output  includes: 

1. providing requested data

2.  requesting  additional  information 

3.  inquiring  about  other  treatment  options  available  at  a  given  time 

4.  agreeing  to  or  refusing  to  submit  to  the  physician’s  suggestion  for  a  test  or  intervention 

5.  storing  new  knowledge  in  the  MVP  memory 

6.  presenting  to  the  physician  in  response  to  an  intolerable  state  of  health  (as  defined  by  the 
MVP)  for  a  first  or  subsequent  visit 

7.  presenting  to  the  physician  at  a  later  date  in  response  to  the  physician’s  request 

We  will  demonstrate  this  process  by  communicating  in  natural  language  with  two  simulated 
patients.  The  first  patient  is  a  knowledgeable  individual  who:  a)  foresees  additional  information 
the  physician  might  desire,  such  as  the  frequency  of  a  symptom  when  asked  if  he  experiences 
that  symptom,  and  b)  desires  a  lot  of  additional  information  about  what  the  user  proposes  to  do. 
In  addition,  the  first  patient  learns  from  the  encounter,  so  that  the  next  time  the  physician 
suggests  an  intervention  which  the  patient  already  knows  about,  the  patient  agrees  without 
further  questioning  because  he  remembers  the  results  of  the  original  decision-making  process. 
The  second  patient:  a)  is  a  trusting  individual  who  essentially  agrees  to  all  suggestions  with  no 
questions asked and b) responds to all questions quite literally, not providing any additional
explanatory  information. 
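
The contrast between the two simulated patients can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the MVP decision model: the class `SimulatedPatient`, the trust threshold of 0.8, the `volunteers_detail` flag, and the response strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Set

TRUST_THRESHOLD = 0.8  # hypothetical cut-off for agreeing without questions


@dataclass
class SimulatedPatient:
    trust: float                      # assumed 0..1 scale
    volunteers_detail: bool           # True: anticipates follow-up questions
    known_interventions: Set[str] = field(default_factory=set)

    def respond_to_suggestion(self, intervention: str) -> str:
        """Agree, or ask for more information, when a test is proposed."""
        if intervention in self.known_interventions:
            # The patient remembers an earlier decision about this intervention.
            return "agree"
        if self.trust >= TRUST_THRESHOLD:
            return "agree"
        return "request_information"

    def learn(self, intervention: str) -> None:
        """Store what the physician explained for use in later encounters."""
        self.known_interventions.add(intervention)

    def answer_symptom_question(self, has_symptom: bool, frequency: str) -> str:
        """Answer literally, or volunteer the frequency if so inclined."""
        if not has_symptom:
            return "No."
        if self.volunteers_detail:
            return f"Yes, {frequency}."
        return "Yes."


if __name__ == "__main__":
    knowledgeable = SimulatedPatient(trust=0.4, volunteers_detail=True)
    trusting = SimulatedPatient(trust=0.95, volunteers_detail=False)

    # "procedure A" is a placeholder name, not an intervention from the MVP.
    print(knowledgeable.respond_to_suggestion("procedure A"))
    knowledgeable.learn("procedure A")
    print(knowledgeable.respond_to_suggestion("procedure A"))
    print(trusting.respond_to_suggestion("procedure A"))
```

In this sketch the first patient asks for information once, then agrees on later encounters because the intervention is already in memory, while the second patient agrees immediately and answers questions without elaboration.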

In  summary,  we  will  display  several  accomplishments: 

1. The MVP retains its previous physiological complexity while gaining the ability to
communicate  in  natural  language  and  to  incorporate  interoception,  cognitive  traits  and 
behavioral  traits  in  its  own  health  care  decision-making 

2. The MVP has personality traits that give it curiosity, free will and other human
characteristics  when  responding  to  the  physician 

3.  The  MVP  can  learn  from  explanations  and  actions  by  the  physician  and  use  that 
knowledge  in  future  decisions  about  health  care 

4.  The  MVP  interaction  involves  a  conversation  that  is  becoming  very  realistic. 

Why Do We Need Ergonomics in MIS?

■  Patients  benefit  while  surgeons  suffer 

■  Patients:  Less  traumatic  alternative  to  open  surgery 

■  Surgeons:  Ergonomic  problems  are  not  inevitable 

■ Recent on-line survey study (Park et al., ACS 2008)

■ ~90% physical discomfort or complications

■  Complication  report  rate  is  correlated 

■  with  number  of  cases  per  year 

■  not  with  age  or  practicing  years 

■ Examples (Kano et al., 1993, 1995; Cuschieri, 1995; Berguer et al., 1999)

■  Increased  upper  extremity  fatigue 

■  Carpal  tunnel  syndrome  &  neuropraxia 

■  Problems  in  neck,  lower  back,  and  lower  extremities 

■  Surgeons  deserve  ergonomically  friendly  environment 

Research Goals

Surgical  Ergonomics  Laboratory  at  the  MASTRI  Center 

■  Better  ergonomics 

■  To  identify  the  ergonomic  risk  factors  associated  with  laparoscopy 
and  find  solutions  (physical  and  cognitive  ergonomics) 

■  Better  environment 

■  To  evaluate  ergonomic  impact  and  efficacy  of  new  technologies 
and  development 

■  New  laparoscopic  instruments  and  training  station 

■  Better  training 

■  To  create  standard  matrices  of  expert  surgical  movements  from 
analysis  of  the  characteristics  of  surgical  movements 

■  To  develop  more  effective  training  protocol  based  on  the  standard 
matrices 


What  Makes  Us  Unique? 

■  Integration  of  state-of-the-art 
experiment  systems 

■  12-camera  Vicon™  motion  analysis 
system 

■  2  AMTI™  force  plates 

■  16-channel  Delsys™  EMG  system 

■  Custom  data  analysis  programs 

■  Systematic  biomechanical 
approaches 

■  Postural  sway  area  analysis 

■  Postural  Stability  Demand  (PSD) 

■  Biomechanical  knee  model 
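
As a rough illustration of what a postural sway area computation can look like, the Python sketch below takes center-of-pressure (COP) samples, such as those derived from a force plate, and returns the area of their convex hull. This is an assumption made for illustration: the laboratory's own sway area and PSD definitions are not given here, and the function names and sample values are hypothetical.

```python
def _cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    n = len(vertices)
    if n < 3:
        return 0.0
    total = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def sway_area(cop_samples):
    """Area enclosed by the convex hull of (x, y) COP samples."""
    return polygon_area(convex_hull(cop_samples))


if __name__ == "__main__":
    # Hypothetical COP samples in centimeters.
    samples = [(0.0, 0.0), (1.2, 0.3), (0.8, 1.1), (-0.4, 0.9), (0.3, 0.5)]
    print(sway_area(samples))
```

A convex hull is only one possible choice. A confidence ellipse fitted to the COP samples is another option, and it is less sensitive to single outlying samples.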

Past Projects


■  Compensatory  and  strategic  movement  analysis  during  FLS  tasks 

■  Postural  instability  represented  by  increased  postural  sway  does  not  necessarily 
correlate  to  poor  surgical  performance 

■  Postural  sway  analysis  during  FLS  tasks 

■  More  experienced  MIS  surgeons  used  task-specific  and  skill-related  postural 
control 

■  Surgical  movement  analysis 

■  Experienced  surgeons  showed  unique  sets  of  joint  movements 

■  Assessment  of  optimal  OR  table  height 

■  Different  OR  Table  heights  were  required  for  laparoscopic  instruments  of 
different  designs  (pistol  grip  and  in-line  grip) 

■  Assessment  of  surgical  assistant  during  simulated  surgery 

■  Due  to  the  leaning  posture  and  extended  arm  position,  the  supporting  leg  of  the 
surgical  assistant  disproportionately  bore  70-80%  of  whole  body  weight 

■  Online  ergonomic  survey  study 

■  Comprehensive  survey  study  performed  with  317  practicing  MIS  surgeons 


Ongoing  Projects 

■  Postural  analysis  of  laparoscopic  surgeons  in  OR  theater 

■  Force  plate  data  collection  during  surgical  cases 

■  Ergonomic  assessment  of  one-handed  vs.  two-handed  surgical 
technique  during  VR  cholecystectomy 

■  Determining  which  technique  is  more  ergonomically  sound  and  which  is 
best  used  in  training 

■  Assessment  of  ergonomics  related  to  NOTES 

■  Comprehensive  comparison  between  traditional  laparoscopy,  flexible 
endoscopy,  and  innovative  NOTES  platform 

■  Longitudinal  study  of  clinical  fellowship  program 

■  Comparing  surgical  skill  levels  of  research  versus  clinical  fellows 

■  Ergonomic  assessment  of  new  laparoscopic  instruments 

What are our long-term goals?


■  Expanding  our  research  boundary  to 
unexplored  or  less  explored  territories 

■  New  approaches 

■ Development of analysis metrics for data collected
over long periods of time (e.g., muscle workload)

■  Biomechanical  models  to  better  understand 
etiologies  of  physical  symptoms 

■  New  research  areas 

■  Ergonomics  in  Flexible  Endoscopy 

■  Ergonomics  in  NOTES 

■  Ergonomics  associated  with  OR  staff  (other  than 
surgeons) 


What  are  our  long-term  goals?  (cont.) 

■  Delivery  of  research  outcomes  for  the  real 
world 

■  Application  of  research  for  better  surgical  training 

■  Self-mentoring  surgical  training  system  with  augmented 
reality 

■  Collaboration  of  surgical  ergonomics  group  with  smart  image 
group 

■  Collaborations  with  industrial  partners 

■  Surgical  instrument  designs  and  testing 

■  Intra-operative  instrument  tracking 

■  Outreach 

■  Presentations  at  national/international  conferences  & 
published  manuscripts 

■  Recommendations  and  guidelines,  professional  societies, 
grand  rounds 